ABSTRACT


A  syntax-directed  editor  facilitates  the  creation  of 
programs  in  a  particular  programming  language.  Because  it 
is  based  on  the  syntax  of  the  language,  the  editor  insures 
the  syntactic  correctness  of  edited  programs.  This  paper 
discusses  the  writer’s  development  of  a  table-driven  syntax- 
directed  editor  capable  of  editing  information  structured 
under  virtually  any  context-free  grammar.  Not  only  does 
this  editor  insure  syntactically  correct  programs,  but  it 
also  possesses  limited  translation  capabilities,  both 
between  high-level  languages  and  from  a  high-level  language 
into a directly executable form. The broader implications
of  such  an  editor,  and  of  syntax-directed  editing  in 
general,  are  also  discussed. 

I. SYNTAX-DIRECTED EDITING IN THE MODERN PROGRAMMING ENVIRONMENT

A.  PROGRAMMING  ENVIRONMENTS 

Over  the  past  ten  years  or  so,  computer  scientists  have 
been  devoting  increasing  attention  to  the  notion  of  a 
"programming  environment":  the  set  of  software  and  hardware 
tools  available  to  aid  the  programmer  in  the  performance  of 
his  task.  In  the  past,  the  programming  environment  merely 
consisted  of  disjoint  systems  programs  that  the  programmer 
had to invoke deliberately and sequentially to input, translate,
and execute his programs. Today, however, environments
are designed so that individual tools are both more
useful  and  well-integrated  as  parts  of  a  whole,  with  the 
overall  result  that  program  development  is  facilitated 
rather  than  hindered. 

As late as the 1970's, program development was an iterative
and tedious process. Some of the tools in a typical
programming  environment  were: 

keypunch machine: used to enter program instructions on
(usually) 80-column data cards;

card reader: a machine used to read the deck of data
cards into the computer's memory;

compiler:  a  program  that  translated  high-level  language 
programs  into  assembler  language  or  internal  machine 
language.  Note  that  if  translated  into  assembler 
language,  an  assembler  program  was  also  required  for 
conversion  into  machine  language  —  in  fact,  this  program 
sometimes  had  to  be  provided  by  the  programmer  as  part  of 
his  card  deck; 


linkage editor: a program that linked object programs
(the machine code produced by the assembler or compiler)
and certain control information before loading;

loader: a program that loaded object modules and needed
library routines for subsequent execution;

line printer: a machine which usually produced the only
visual output from the system described above.

The  tools  described  above  formed  a  strictly  "batch" 
environment.  This  environment  was  not  significantly 
improved with the addition of time-sharing, which basically
involved the combination of input and output devices into a
teletype-style terminal. However, time-sharing did give
rise to stored files of programs and data, primitive editing
features to create these files, and new control words (such
as "RUN") to combine several compilation-to-execution
primitives.

Consider  the  process  the  programmer  had  to  follow  to 
develop  a  correct  program  using  the  tools  described  above. 
After  designing  an  algorithm  to  solve  his  real-world  problem 
and  selecting  a  programming  language,  he  usually  drafted  the 
program on paper and desk-checked its correctness by stepping
through the program one statement at a time. When he
was  satisfied  that  his  program  was  correct,  he  keypunched  it 
onto  data  cards  and  combined  them  with  the  necessary  control 
cards  to  invoke  the  tools  he  desired.  A  typical  card  deck 
included  such  cards  as: 

job card: to uniquely identify the program while in the
computer;

compiler card: to invoke the compiler of the chosen
language;

the program cards;

assembler card: to invoke the assembler (required if the
compiler's output was assembly language and not machine
code);

the assembler program: if required, as a deck of cards;

load card: to invoke the (usually system-provided)
loader;

object modules: program portions previously compiled for
inclusion in this program (reusable subroutines, for
example);

data card: to tell the system that input data followed;

input data cards;

end card: to signify the end of the data (and the job)
[Ref. 1: p. 200].

After  preparing  the  card  deck,  the  programmer  fed  the 
deck  into  the  card  reader  and  waited  for  his  output,  which 
was  usually  produced  on  the  line  printer.  If  the  program 
contained a compilation error, the programmer had to determine
the cause of error (usually with the help of a diagnostic
message of questionable utility), edit his program by
typing  new  cards  to  replace  the  erroneous  ones,  and  resubmit 
his  deck  through  the  card  reader.  Even  if  the  program 
compiled  successfully,  it  might  have  been  aborted  during 
execution  because  of  a  run-time  error,  again  necessitating 
the  error  detection  and  correction  procedures  previously 
mentioned.  A  third  error  type  that  occurred  was  the  logic 
error  that  compiled  and  executed  but  produced  incorrect 
output.  After  checking  the  output  and  determining  it  was 
incorrect,  the  programmer  again  had  to  determine  the  cause 
of  error  (but  this  time  without  any  diagnostic  aid  from  the 
system)  and  repair  the  program. 

Even if the expense of computer time were not a factor
(which it was in that era), one can easily understand the
other reason for program drafting and desk-checking: to
lessen the personal frustration of program correction and
resubmission [Ref. 2: p. 445]. Clearly the programming
environment was not conducive to program development. It
forced the programmer to concern himself with satisfying its
requirements, avoiding aborted runs, and minimizing the
number of job submissions, while his prime concern should
have been the problem he was originally trying to solve.

It is certainly true that the hardware technology of the
time played a role in causing the unfriendliness of the
programming environment. However, the arrival of faster,
cheaper computers and interactive time-sharing with intelligent
terminals in the 1970's did little more than replace
the above process with the tedious cycle of invoking an
editor, editing a program, saving the program, exiting the
editor, invoking the compiler, debugging, and re-invoking
the editor to effect changes. Only recently has attention
been directed toward taking better advantage of modern
capabilities to provide a useful, productive programming
environment.

Because programming environments are such a current
research topic in computer science, there are many opinions
as to what a modern environment should do for the user. It
is safe to say, however, that it should do much more than
simply correct the obvious deficiencies of previous systems
as discussed above. Sandewall [Ref. 3: pp. 35-36], for
example, presented a list of some of the desirable functions
of a programming environment, which included administration
of program modules, test cases, and documentation;
interdialect translation; compatibility checking between program
segments; support for a particular development methodology
(such as top-down design); enhanced support of the

create a portion of the tree (by building from a nonterminal
leaf in an incomplete program tree);

move  a  portion  of  a  program  tree  from  one  location  to 
another; 

insert a subtree into a sequence of like subtrees.

The  above  capabilities  describe  what  is  performed  on  the 
tree. Viewed in terms of their textual effects, these functions
enable the user to:

delete  a  user-defined  name,  a  statement  segment,  an  entire 
statement,  or  a  block  of  statements  in  a  single  command; 

add  to  the  current  program; 

move text from one location to another;

insert an item into a sequence of like items.

Note  that  on  a  more  general  level,  the  SDE  allows  the  user 
to  ADD  or  DELETE  information.  It  does  not,  however,  allow 
the  user  to  directly  CHANGE  information  (for  example, 
through a "global replace" operation), although this function
may be realized indirectly through a series of DELETE
and ADD commands. (Reasons for this limitation of the SDE
and possible corrective implementations are discussed in
Chapter 5.)

Deletion  of  a  program  segment  is  simple.  The  user  moves 
the  CP  until  it  references  the  entire  portion  to  be  deleted 
(as  indicated  by  highlighting  with  inverse  video  on  some 
terminals),  then  enters  the  control-L  command  ("erase"). 
That  portion  of  the  text  is  removed  and  replaced  with  the 
name of the nonterminal node type expected there.
("<decl>*" is an example of such a node type. Its presence
tells the user that this portion of the program is
incomplete.)
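
The erase operation described above can be sketched in a few lines of Python. This is an illustration only, not the SDE's Pascal implementation: the `Node` layout, the `expanded` flag, and the `erase` and `unparse` names are all assumptions made for the example. The point it shows is that deletion does not leave a hole in the tree; the erased node stays in place and is displayed by the name of its nonterminal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a program tree (hypothetical layout, not the SDE's own)."""
    kind: str                       # grammar symbol for this node, e.g. "<decl>*"
    text: str = ""                  # terminal text carried by the node, if any
    children: List["Node"] = field(default_factory=list)
    expanded: bool = True           # False means an unfilled nonterminal leaf


def erase(node: Node) -> None:
    """Discard everything under `node`, leaving an unfilled nonterminal."""
    node.children.clear()
    node.text = ""
    node.expanded = False


def unparse(node: Node) -> str:
    """Render the tree as text; unfilled nonterminals show their own name."""
    if not node.expanded:
        return node.kind
    parts = [node.text] if node.text else []
    parts += [unparse(child) for child in node.children]
    return " ".join(parts)


# A tiny block: two declarations followed by one statement.
decls = Node("<decl>*", children=[Node("<decl>", "integer i"),
                                  Node("<decl>", "integer j")])
block = Node("<block>", children=[Node("begin", "begin"), decls,
                                  Node("<statement>", "i := 0"),
                                  Node("end", "end")])

print(unparse(block))   # begin integer i integer j i := 0 end
erase(decls)            # the CP references the whole declaration sequence
print(unparse(block))   # begin <decl>* i := 0 end
```

Because the placeholder remains a node of the tree, a later ADD at the same position has a well-defined target, which is how a CHANGE can be composed from a DELETE followed by an ADD.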

however, and can be accessed in the same manner as any other
node. Should the user later wish to insert declarations in
this block, he can erase the "nil" node (using control-L),
at which time the SDE will prompt him for declarations.


begin
   integer i
   integer j
   i := 0;
   while i < 10 do
      begin
         j := i * i;
         i := i + 1
      end
end

^E end sessn    ^B chg depth    | dsply togl    M move togl
^P parent       ^G right        ^L erase        ^H grab

Figure 2.4 Editing State, CP = Declarations


D. EDITING A PROGRAM

Moving  around  in  a  program  tree  may  be  considered  a 
"passive"  activity  in  that  it  has  no  effect  on  the  tree 
itself.  The  following  paragraphs  discuss  those  editing 
commands  which  change  the  program  tree  --  in  other  words, 
the actual "editing" functions of the SDE.

The  SDE  is  capable  of  performing  the  following  editing 
functions: 


delete  a  portion  of  the  program  tree; 

elements of the sequence, and is useful when attempting to
delete  or  move  the  remaining  elements. 

Returning to the situation depicted in Figure 2.2,
suppose the user types "control-G" to move the CP to the
right. When the screen is updated, it will appear as shown
in Figure 2.3.

Figure 2.3 Editing State, CP = "begin-end" Block

The menu portion is unchanged except that
the command to move to the right has been removed (indicating
there are no more brothers to the right) and the
command to move to the left has been added (indicating the
previous location of the CP). Typing "control-T" at this
point causes the strange display indicated in Figure 2.4.
The CP is located at a node in the program tree that has no
textual representation on the screen. In this particular
instance, the CP is the "declarations" portion of the
"begin-end" block, which the user chose to close off in a
previous editing session. A "nil" node remains in the tree.




begin
   integer i
   integer j
   i := 0;
   while i < 10 do
      begin
         j := i * i;
         i := i + 1
      end
end

^E end sessn    ^B chg depth    | dsply togl    M move togl
^P parent       ^G right        ^L erase        ^H grab
^T child        ^A chg focus

Figure 2.2 Sample State of Editing Using Minigol Grammar

(Actually,  this  is  not  entirely  true.  When  the  user 
selects  a  command  to  move  the  CP,  the  actual  direction  of 
movement is hidden from him. However, the apparent movement,
as seen on the display, is in the direction selected
by the user. For a more thorough explanation of this operation,
see Section 3.C.)

As  the  user  moves  the  CP  about  the  program  tree,  the  set 
of  legal  commands  changes.  For  example,  the  command  to  move 
to  a  right  brother  is  offered  only  if  that  brother  exists. 
The  entire  set  of  movement  commands  (each  being  offered  when 
applicable) includes "parent" (to move upward in the tree),
"child" (to move downward in the tree), and "right" and
"left" (to move to brother nodes on the same level in the
tree). An additional command, "rest seq," applies only when
the  CP  points  to  an  element  of  a  sequence,  such  as  a 
sequence  of  individual  declarations  within  a  block.  This 
command  positions  the  CP  to  reference  all  subsequent 

Figure 2.1 Initial Display for New Minigol Program


C. MOVING AROUND IN A PROGRAM

The  following  paragraphs  assume  the  current  state  of 
editing  to  be  as  depicted  in  Figure  2.2.  (The  grammar  in 
use  is,  again,  the  "Minigol"  grammar  of  Appendix  E.) 

Figure 2.2 indicates that the Current Position is the
"i < 10" portion of the "while" statement. Observe the
commands the editor makes available to the user when at this
CP. Setting aside the more general commands for now, one
notes that the user may move the CP either to the right, to
a "child," or to a "parent." These movements make sense
when one realizes that he is moving through the program tree
and not directly through the text. Thus, moving to the
"parent" shifts the CP to the node in the program tree whose
sons include the "i < 10" portion of the program; moving to
the right shifts the CP to its brother node to the immediate
right in the tree; and moving to the "child" shifts the CP
to the leftmost son of the (present) CP.
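
The relationship between the CP and the movement commands on offer can be sketched as follows. The sketch is in Python rather than the SDE's Pascal, and the `Node` class and `legal_moves` function are hypothetical names introduced for illustration; the actual data structures are the subject of later chapters. It shows only the idea that each command is offered when, and only when, its destination exists in the tree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # nodes compare by identity, so list.index finds this exact node
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Attach `child` as the new rightmost son and return it."""
        child.parent = self
        self.children.append(child)
        return child


def legal_moves(cp: Node) -> Dict[str, Node]:
    """Map each movement command available at `cp` to its destination."""
    moves: Dict[str, Node] = {}
    if cp.parent is not None:
        moves["parent"] = cp.parent
        brothers = cp.parent.children
        i = brothers.index(cp)
        if i > 0:
            moves["left"] = brothers[i - 1]
        if i + 1 < len(brothers):
            moves["right"] = brothers[i + 1]
    if cp.children:
        moves["child"] = cp.children[0]   # leftmost son
    return moves


# The "while" statement of the running example, reduced to a few nodes.
while_stmt = Node("while")
cond = while_stmt.add(Node("i < 10"))
cond.add(Node("i"))
cond.add(Node("10"))
body = while_stmt.add(Node("begin-end"))

print(sorted(legal_moves(cond)))   # ['child', 'parent', 'right']
print(sorted(legal_moves(body)))   # ['left', 'parent']
```

With the CP on the condition, the offered moves are "parent", "child", and "right", as in Figure 2.2. In this reduced tree the block has no sons, so after a move to the right only "parent" and "left" remain.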

readable format to the SDE -- which is assured if it was
written  by  the  SDE  using  the  same  grammar  file  in  the  last 
editing  session.) 

FILE ALREADY EXISTS (Y OR N)? (This question must be
answered to prevent the SDE from attempting to access a
file that doesn't exist. It is a limitation of the Pascal
implementation of the SDE.)

The  present  implementation  allows  all  of  the  above 
information  to  be  provided  on  the  "command  line"  that 
invokes  the  SDE.  Thus  the  same  initialization  could  be 
achieved by typing SDE PASCAL DEMO.P Y, for example.

(A third file, called "TERM", is also accessed by the
SDE and must be present in the environment. This file
provides information to the SDE about the display screen
being used. Details about this file are provided in
Appendix F.)

At  this  point  in  the  session,  the  SDE  has  read  the 
grammar and "TERM" files and has organized the information
contained in them. If a pre-existent program file was indicated,
this file has also been read and processed; the
program  as  last  edited  appears  on  the  screen.  If  creating  a 
new  program,  a  skeleton  of  a  program  (based  on  the  selected 
grammar)  appears.  In  either  case,  a  menu  of  choices  also 
appears  at  the  bottom  of  the  screen.  The  initial  display  of 
a new program using the "Minigol" grammar in Appendix E is
shown  in  Figure  2.1. 

Note  in  the  figure  that  the  current  focus  of  attention 
(hereafter  called  the  "current  position"  or  CP)  is  indicated 
by  underlining.  On  an  actual  display  screen,  the  current 
position  is  indicated  by  inverse  video  (if  the  terminal 
supports  it)  or  by  any  distinguishing  characters  indicated 
in  the  "TERM"  file. 

II. A SAMPLE EDITING SESSION WITH THE SDE

A.  GENERAL 

The  SDE  was  designed  for  interactive  sessions  using  a 
computer  terminal.  It  displays  edited  programs  in  textual 
form  on  the  screen,  but  creates,  manipulates,  and  stores 
them as program trees. The SDE does not hide this representation
from the user. Rather, many of its commands are
worded  to  guide  the  user  through  the  tree  he  is  editing, 
constantly  reminding  him  that  his  real  product  is  something 
other  than  its  textual  representation. 

The  user  interacts  with  the  editor  by  typing  the  desired 
command,  followed  by  a  carriage  return.  He  can  also  enter  a 
series of commands (up to 80 characters in total length)
followed  by  a  single  carriage  return.  Illegal  commands  are 
detected  and  reported  when  entered  individually;  when 
entered  as  part  of  a  string  of  commands,  they  are  reported 
and  the  rest  of  the  string  is  ignored. 
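
The handling of a command string can be sketched as below. The sketch is in Python, not the SDE's Pascal, and makes several assumptions that the text does not state: each command is a single character, characters beyond the 80-character limit are simply dropped, and the names `run_line`, `legal`, and `execute` are invented for the example. The set of legal commands is re-queried before every command because it changes as the CP moves.

```python
from typing import Callable, List, Optional, Set, Tuple

MAX_LINE = 80  # total length limit for one line of commands, as stated in the text


def run_line(line: str,
             legal: Callable[[], Set[str]],
             execute: Callable[[str], None]) -> Tuple[List[str], Optional[str]]:
    """Execute commands left to right; stop at the first illegal one.

    Returns the commands actually executed and an error report
    (None if the whole line was legal).
    """
    done: List[str] = []
    for ch in line[:MAX_LINE]:          # assumption: excess input is dropped
        if ch not in legal():           # legal set may differ after each command
            return done, f"illegal command {ch!r}; rest of line ignored"
        execute(ch)
        done.append(ch)
    return done, None


log: List[str] = []
done, error = run_line("abxa", lambda: {"a", "b"}, log.append)
print(done)    # ['a', 'b']
print(error)   # illegal command 'x'; rest of line ignored
```

Stopping at the first illegal command is the conservative choice here: the commands after it were typed on the assumption that it would succeed, so executing them against a different CP could edit the wrong part of the tree.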

(In the paragraphs that follow, terms such as
"control-A" or "^A" refer to the consecutive striking of the
"control" and "a" keys on the keyboard.)

B.  INITIALIZING  THE  SDE 

Every  session  with  the  SDE  commences  with  the  SDE 
presenting  a  series  of  questions  to  the  user  as  follows: 

GRAMMAR  FILE:  (Enter  the  name  of  the  file  containing  the 
grammar  the  SDE  will  use  to  parse  and  display  the 
program.)

PROGRAM  FILE:  (Enter  the  name  of  the  file  to  be  edited. 
If  editing  an  already-existing  file,  the  file  must  be  in 


the SDE to include data structures used, display implementation,
and data storage.

Chapter  Five  assesses  the  accomplishments  of  the  SDE 
both in theory and as a product. It describes the SDE's
design decisions and may serve as an "after-action report"
on the SDE. Improvements and future development are also
discussed. Chapter Five further contrasts syntax-directed
editors  with  text  editors  and  discusses  the  implications  of 
syntax-directed  editing  in  general. 


data" [Ref. 13: p. 129] in that it can edit any information
that can be structured under a format acceptable for input.
The notion that syntax-directed editing may be applied to
structures other than program trees is further discussed in
[Ref. 14].

C. INTRODUCTION TO THE SDE AND OVERVIEW OF THIS PAPER

With the above discussion of modern programming environments
and syntax-directed editing as background, this paper
will  discuss  the  writer’s  development  of  a  table-driven 
syntax-directed  editor  capable  of  manipulating  information 
structured  under  virtually  any  context-free  grammar.  This 
editor,  hereafter  called  the  SDE,  stores,  retrieves,  and 
edits  tree  structures  based  on  the  rules  presented  in  an 
input  grammar  selected  by  the  user.  Interactive  in  nature, 
it  is  menu-driven  and  terminal-independent.  As  will  be 
seen,  its  manner  of  tree  manipulation  also  gives  it  limited 
language  translation  and  other  desirable  properties. 

The  SDE  was  based  primarily  on  the  work  found  in 
[Ref. 15]. It was programmed in Pascal as compiled by the
Berkeley compiler, and is currently in operation within a
Unix environment on a VAX 11/760 minicomputer. The reader
is  also  invited  to  read  [Ref.  16],  which  presents  an 
in-depth  discussion  of  table-driven  syntax-directed  editing 
and  which  served  as  research  material  both  for  this  paper 
and  for  [Ref.  15]. 

Chapters  Two,  Three,  and  Four  of  this  paper  may  be 
viewed  as  describing  the  SDE  in  progressively  greater  levels 
of  detail.  Chapter  Two  describes  a  sample  session  using  the 
SDE  and  serves  as  an  introduction  to  its  operation.  Chapter 
Three  discusses  the  conceptual  basis  of  the  SDE,  including 
the  algorithms  it  uses  to  display  and  store  information. 
Finally,  Chapter  Four  discusses  detailed  implementation  of 

navigate within a tree, as in a syntax-directed editor, but
they  also  allow  the  user  to  enter  text  at  any  stage  in  the 
editing  process.  The  text  is  parsed  by  the  editor,  and  the 
tree  produced  by  the  parse  is  grafted  onto  the  existing 
program  tree.  Another  editor  in  this  category  is  the  "Z" 
editor at Yale University [Ref. 9]. It possesses syntax-
directed  features  such  as  automatic  indentation,  automatic 
balancing  of  expressions,  user-directed  selection  of  entire 
syntactic  units,  and  an  adjustable  level  of  display  detail. 
However,  since  it  uses  a  text-oriented  model  of  a  program 
rather  than  a  tree  structure,  it  is  more  accurate  to  state 
that  Z  is  a  text  editor  capable  of  simulating  many  of  the 
functions  found  in  a  syntax-directed  editor. 

Whereas  Z  is  a  text  editor  that  simulates  a  syntax- 
directed  editor,  there  also  exist  syntax-directed  editors 
that manipulate text. ED3 [Ref. 10], for example, is an
editor "primarily designed for manipulation of hierarchically
structured texts." It does so by superimposing a tree
structure  onto  the  text,  analogous  to  a  structured  outline 
or  table  of  contents.  The  section  to  be  edited  or  viewed  is 
selected  by  navigating  around  the  tree.  Further  discussion 
of structured "document editors" may be found in [Ref. 11]
and [Ref. 12].

The MENTOR system [Ref. 13] includes an editor that can
accurately be called syntax-directed because it edits
programs  by  manipulating  abstract  syntax  trees  based  on  the 
grammars  of  programming  languages.  The  system  utilizes  a 
tree manipulation language, MENTOL, which includes primitives
from which macros may be created to tailor the system
to  edit  programs  in  a  particular  programming  language.  A 
viable  set  of  Pascal  macros  currently  exists.  Note, 
however, that because the MENTOR system may be configured to
handle  any  of  a  variety  of  languages,  it  is  accurately 
described  as  "a  processor  designed  to  manipulate  structured 

such translation amounts to direct substitution of one set
of  templates  for  another. 
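
Translation by template substitution can be illustrated with a short Python sketch. Everything named here is an assumption made for the example: the `Node` layout, the rule names, and the two template tables are hypothetical and are not taken from any of the editors discussed. The sketch shows one tree rendered under two template sets, which is all that this limited form of translation requires.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    rule: str                       # name of the grammar rule applied at this node
    children: List["Node"] = field(default_factory=list)
    text: str = ""                  # user-entered text, used only by "leaf" nodes


# Hypothetical template sets; "{0}" and "{1}" stand for the rendered sons.
ALGOL_LIKE: Dict[str, str] = {
    "while": "while {0} do {1}",
    "assign": "{0} := {1}",
}
C_LIKE: Dict[str, str] = {
    "while": "while ({0}) {1}",
    "assign": "{0} = {1};",
}


def unparse(node: Node, templates: Dict[str, str]) -> str:
    """Render the tree as text using the given set of templates."""
    if node.rule == "leaf":
        return node.text
    parts = [unparse(child, templates) for child in node.children]
    return templates[node.rule].format(*parts)


loop = Node("while", [
    Node("leaf", text="i < 10"),
    Node("assign", [Node("leaf", text="i"), Node("leaf", text="i + 1")]),
])

print(unparse(loop, ALGOL_LIKE))   # while i < 10 do i := i + 1
print(unparse(loop, C_LIKE))       # while (i < 10) i = i + 1;
```

The approach works only where the two languages share the same tree shape for a construct. The leaf text is copied unchanged, so differences in expression syntax are not handled, which is one reason such translation is described as limited.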

There are several examples of syntax-directed editors in
widespread  use  today.  One  that  most  nearly  matches  the 
description  above  is  that  found  in  the  Cornell  Program 
Synthesizer [Ref. 6]. It uses templates for PL/CS grammatical
constructs and creates programs top-down by inserting
templates  and  phrases  into  the  existing  templates.  However, 
this  editor  is  more  than  a  simple  syntax-directed  editor 
because  it  insures  a  degree  of  semantic  correctness  as  well. 
For  example,  it  identifies  variables  that  are  referenced  but 
not declared. As part of the overall programming environment,
its product is directly executable by other tools.
Even  when  a  program  has  not  been  fully  created,  it  can  be 
interpreted  up  to  the  point  of  incompletion.  New  code  can 
then  be  entered,  and  interpretation  can  resume. 

Interlisp [Ref. 5], also mentioned above, has an editor
that  manipulates  a  program  through  its  syntactic  structure 
rather  than  its  textual  form.  Due  to  the  syntactic 
simplicity  of  LISP,  however,  this  editor  does  not  use 
templates;  virtually  any  combination  of  atoms  and  lists 
comprises a syntactically (if not semantically) correct LISP
program. Originally, Interlisp's editor was designed for
teletype-style interaction and had no full-screen capability.
More recently, a display-oriented editor, DED, has
been included to enhance Interlisp's interactive nature
[Ref. 7].

One  variation  on  syntax-directed  editing  is  to  combine 
the qualities of syntax-directed editing with text editing.
[Ref. 8] describes a family of such editors produced from a
Hybrid Editor Generator, which receives as input a specification
for a grammar and outputs an editor for that
language. These Automatically Generated Editors allow the
user to enter menu selections to create program segments and

B. SYNTAX-DIRECTED EDITORS


A  syntax-directed  editor,  as  its  name  suggests,  is  a 
tool used to edit programs based on the syntax of an underlying
programming language. Typically, it utilizes
templates  of  language  constructs  inside  of  which  the 
programmer  enters  such  items  as  variable  names,  procedure 
calls, output strings, and so on. The goal of such an
editor  is  to  free  the  programmer  from  concern  over  syntactic 
issues.  A  program  constructed  with  a  syntax-directed  editor 
is  assured  to  be  free  of  syntax  errors. 

One  advantage  of  a  syntax-directed  editor  is  that 
overall development time may be reduced by avoiding edit
sessions  whose  sole  purpose  is  to  correct  syntax  errors  for 
subsequent  re-compilation.  If  the  editor  utilizes 
templates,  two  additional  advantages  may  be  realized. 
First,  the  user  need  not  even  learn  the  details  of  the 
language's  syntax  —  he  merely  has  to  know  which  template  to 
install  at  a  given  point  in  the  program.  Second,  selection 
of  templates  with  single  keystrokes  may  reduce  the  time 
spent  in  the  editing  process  itself  when  compared  to  typing 
the  symbols  in  the  constructs  individually  using  a  text 
editor. 

Syntax-directed  editors  typically  represent  the  program 
they  are  editing  as  a  tree  structure,  and  present  a  textual 
image  to  the  user  through  the  templates  he  has  selected.  If 
the  internal  representation  of  the  program  is  in  fact  a 
structure  suitable  for  subsequent  interpretation  or  code 
generation,  then  the  editor  serves  as  a  parser  as  well.  In 
terms  of  the  overall  environment,  this  allows  the  presence 
of a much simpler (and faster) compiler or interpreter
needing no scanner, parser, or syntax error recovery mechanism.
Some limited facility for program translation from
one  high-level  language  to  another  may  also  be  possible,  if 


displays  it  in  textual  form.  The  Synthesizer  also  includes 
sophisticated  debugging  aids  that  permit  tracing  the  flow  of 
execution  through  the  program  at  any  user-selected  rate. 
The user can step the program one statement or construct at
a  time,  and  may  command  the  system  to  display  the  value  of 
particular  variables  as  he  does  so.  (This  is  an  excellent 
example  of  an  environment  freeing  the  user  from  a  tedious 
activity.  Contrast  such  a  feature  with  the  outdated  advice 
given in [Ref. 2: p. 453], which states that to trace a
program's progress, "the use of additional WRITE commands in
strategic places is the most useful technique.")

One  characteristic  of  both  environments  described  above, 
and of modern environments in general, is the integration of
individual  tools.  The  progress  of  a  particular  tool  is 
shared  with  the  others,  with  the  result  that  the  system  both 
eliminates  duplication  of  effort  and  gains  knowledge  about 
the  program  being  developed.  For  example,  the  syntax- 
directed  editor  of  the  Cornell  Program  Synthesizer  produces 
an executable derivation tree form of the program during the
editing  session;  using  such  a  structure  as  an  interface 
between  tools,  subsequent  compilation  or  direct  execution 
can  begin  without  the  re-parsing  which  would  have  been 
required  had  a  conventional  text  editor  been  used.  The 
Interlisp  tools  are  often  invoked  from  within  each  other  by 
the  user,  allowing  him  to  consider  program  segments  from 
different  perspectives  without  losing  his  place  in  the 
program.  Sharing  of  program  knowledge  among  the  tools  thus 
can  provide  a  more  responsive  overall  environment.  In  the 
future,  programming  environments  may  even  resemble 
Winograd's System A [Ref. 4], which comes to "understand" a
program  as  it  is  being  developed,  forming  its  own  comments 
and  checking  the  program  against  the  user's  apparent 
intentions. 

interactive session (to enable, for example, the programmer
to back up through his commands and undo their effects); and
specialized editing based on the editor's knowledge of the
syntax of the programming language. Winograd [Ref. 4: p.
14]  envisioned  a  futuristic  environment  as  a  "moderately 
stupid  assistant,  to  whom  we  give  all  the  information  we 
possibly  can,  and  who  in  turn  relieves  us  of  much  of  the 
burden  of  memory,  tedious  checking,  and  drawing  more-or-less 
straightforward conclusions." Based on the above, it is
appropriate to summarize simply that a programming environment
should do everything possible to facilitate program
development. 

There are many "state-of-the-art" programming environments
in operation, each possessing somewhat different capabilities.
Interlisp [Ref. 5], for example, provides an
environment  for  the  development  of  LISP  programs.  During 
interactive  sessions,  the  user  talks  exclusively  to  the 
Interlisp  system.  The  program  being  developed  is  created, 
stored,  and  manipulated  as  a  data  structure  by  the  system's 
structure  editor,  which  displays  the  program  in  textual  form 
for the user. A facility called "Masterscope" analyzes and
cross-references  the  program  to  provide  such  information  as 
which  functions  call  which,  how  and  where  variables  are 
bound, and so on. Interlisp also includes a DWIM ("Do What
I  Mean")  facility  which,  upon  error  detection,  attempts  to 
determine  what  the  user  intended  and  automatically  make  the 
necessary  correction. 

Another  example  of  a  modern  programming  environment  is 
the  Cornell  Programming  Synthesizer  for  PL/CS,  a  subset  of 
the  PL/I  language  [Ref.  6].  It  allows  creation  and  editing 
of  programs  through  a  syntax-directed  editor,  which  uses 
templates  based  on  the  language's  grammar  to  insure  the 
syntactic  correctness  of  the  program.  Like  Interlisp,  it 
stores the program internally in a tree structure but

Adding to the existing tree depends on one's location
within the tree. First, such an operation is legal
only when the CP references a nonterminal node as described
in the previous paragraph. When the CP references a nonterminal
on the screen, it is referencing a leaf on the (incomplete)
program tree to which sons must be added to complete
the tree. Second, the nature of this specific nonterminal
dictates the choice of possible sons from which the user may
select. Referring again to Figure 2.1, the menu includes
commands to select an "integer" or "real" declaration.
These options would not have been offered if the CP were
referencing the "statement" nonterminal instead of the
"decl" nonterminal.

Assuming the user selected command "n" (for "integer")
in Figure 2.1, the display would then resemble that of
Figure 2.5. Note that the word "integer" has been added to
the display, the CP references a new "var" nonterminal, and
the menu choices reflect the new CP. The SDE's automatic
movement of the CP is a feature optimized to permit the user
to create his entire program from top to bottom (of text)
without having to move the CP himself.

    begin

        integer <id> <decl>* <statement>;

    end

    ^E end sessn     ^B chg depth     | dsply togl     M move togl
    ^P parent        ^F left          ^X put           any char

    Figure 2.5  Program Display While Creating Declarations

Moving a tree segment from one location to another is a
two-step process. First, the user must "grab" the desired
portion of the tree or text. This is done by positioning
the CP on the entire portion desired, then entering
control-H for "grab" along with a digit from 0 to 9. (Note,
then, that the SDE can maintain up to ten "grabbed" segments
at a time.) No change occurs on the display, because grabbing
a program segment does not delete its present occurrence.
The "grab" function is therefore a "copy" function
which allows duplication of program segments. To delete the
original occurrence, the "erase" command discussed above may
be applied after the segment has been grabbed.

The second step in moving a segment is to place it in
its new location. This new location must be a nonterminal
leaf as described above. Further, the nonterminal must be
compatible with the root of the program segment to be
attached. Thus, one cannot attach a sequence of declarations
where a sequence of statements is expected, nor can he
even attach it where a single declaration (not a sequence)
is expected. The user attaches a grabbed program segment by
entering the "put" command (control-X) along with the digit
(0 through 9) referencing the grabbed segment. If the
segment is not compatible with the CP, an error message will
be displayed and the graft will not take place.
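The grab and put steps can be sketched as follows. This is only an illustration in Python, not the SDE's own implementation: the names `Node`, `Editor`, `grab`, and `put` are hypothetical, and compatibility is simplified here to an exact match between the nonterminal leaf and the nonterminal at the root of the grabbed segment.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A tree node; `filled` is False for a nonterminal leaf awaiting sons."""
    nonterminal: str
    children: List["Node"] = field(default_factory=list)
    filled: bool = False


class Editor:
    """Holds up to ten grabbed segments, indexed by the digits 0 through 9."""

    def __init__(self) -> None:
        self.buffers: Dict[int, Node] = {}

    def grab(self, digit: int, segment: Node) -> None:
        # Grabbing copies the segment; the original stays in the tree.
        if not 0 <= digit <= 9:
            raise ValueError("buffer digit must be 0 through 9")
        self.buffers[digit] = copy.deepcopy(segment)

    def put(self, digit: int, leaf: Node) -> Optional[str]:
        """Graft buffer `digit` onto `leaf`; return an error message or None."""
        segment = self.buffers.get(digit)
        if segment is None:
            return "nothing grabbed under that digit"
        if leaf.filled:
            return "target is not a nonterminal leaf"
        if segment.nonterminal != leaf.nonterminal:
            return "segment is not compatible with the current position"
        # Copy again so the same buffer can be put in several places.
        leaf.children = copy.deepcopy(segment).children
        leaf.filled = True
        return None
```

Because `grab` stores a copy, a later "erase" of the original would not disturb the buffer, which is what lets grab followed by erase followed by put act as a move.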

The final editing capability of the SDE, inserting, is
accomplished through the control-[ key, which invokes the
"insert before" command. This command is offered only when
the CP references an item in a sequence of items in the
tree. A sequence is defined in a grammar by the "*", "+",
or "..." property of a nonterminal as displayed on the
screen. Thus "<decl>*" and "<statement>;..." both indicate
sequences. Entering control-[ allows the user to enter an
item into a sequence textually in front of the item referenced
by the CP. For example, Figure 2.6 shows the display
after the user selected control-[ when the CP had indicated
"integer j". He can now enter a single declaration to be
inserted as indicated, using the normal creation commands in
the menu.

    begin

        integer i <decl>

        integer j

    Figure 2.6  Display After Selecting "Insrt Befr"

E. TERMINATING AN EDITING SESSION

The  user  terminates  an  editing  session  by  entering 
control-E.  The  SDE  then  asks  him  two  questions: 

SAVE PROGRAM IN PARSED FORM (Y OR N) ?

SAVE  TEXT  FORM  (Y  OR  N)  ? 


The  first  question  above  corresponds  to  the  "save"  or  "quit" 
command  found  in  text  editors,  for  it  enables  the  user 
either  to  save  the  edited  tree  or  discard  it.  If  the  user 
indicates  he  wishes  to  save  the  parsed  form,  it  is  saved 
under  the  same  file  name  entered  during  the  initialization 
process;  thus  the  previous  version  of  the  program,  if  any, 
is  lost.  If  the  user  instructs  the  SDE  not  to  save  the 
parsed  form,  the  previous  version  remains  intact. 

The  second  question  relates  to  the  text  form  of  the 
created  program.  At  the  user's  response  to  this  question, 
the  text  form  of  the  program  may  be  saved  in  a  file  to  be 
named  by  the  user.  Note  that  this  file  is  textual,  and  is 
not  suitable  for  input  to  the  SDE  at  a  later  date.  However, 
it  is  useful  in  that  it  can  be  retained  as  a  text  file  for 
archival  or  inspection  purposes.  Further,  if  complete,  it 
represents  a  syntactically  correct  program  ready  for  input 
to  a  conventional  parser  or  interpreter. 

F. ADDITIONAL FEATURES OF THE SDE

While  the  above  capabilities  represent  a  functional 
syntax-directed  editor,  the  SDE  contains  several  additional 
features  to  make  it  more  interactive  and  responsive  to  the 
user's  needs.  For  example,  the  "display  toggle"  disables 
the  display  of  the  menu,  allowing  the  user  to  view  more  of 
his  program  on  the  screen.  A  second  entering  of  the  command 
("|")  will  restore  the  menu. 

A more significant display feature is the combination
"change focus" and "change depth." The "focus node" is
defined as that node in the tree at which screen display
begins. When viewing the entire program, the focus node is
thus the root of the tree. Note that the focus node is
always the root of a subtree, and only that subtree will be
displayed. The "depth" value indicates how many generations
of descendants from the focus node are to be displayed.
Descendants below the depth limit will be displayed as an
ellipsis ("..."). The combination of the focus node and the
depth limit allows the user to see a detailed view of one
portion of his program, or to see an abbreviated view of his
entire program, on the display. Figure 2.7 represents a
display of the program listed in Figure 2.2 when the focus
is adjusted to view only the "statements" portion of the
program; the depth limit is set sufficiently high to permit
viewing of all aspects of the statements. Figure 2.8, on
the other hand, represents a broader perspective of the same
program. In this case, the focus node is the root of the
tree, and the depth limit has been set low.

Figure 2.7  Program Display, Focus = Statements

Figure 2.8  Program Display at Low Depth Setting

A focus node is selected by positioning the CP over the
desired program segment to be viewed, then entering
control-A for the "chg focus" command. Note that such a
process can serve only to bring the focus "closer" to the
lowest program level. Elevating the focus to a higher
perspective is accomplished through the control-P ("parent")
command, which automatically raises the focus as necessary
whenever the CP is moved upward in the tree. The depth
limit is set by selecting control-B for "chg depth,"
followed by a positive integer.
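The interaction of focus node and depth limit can be illustrated with a short sketch. It is written in Python for brevity and is not taken from the SDE; `Node` and `display` are hypothetical names, and indentation stands in for whatever layout the real display uses.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def display(focus: Node, depth: int, indent: int = 0) -> List[str]:
    """Return display lines for the subtree rooted at `focus`.

    `depth` is the number of generations of descendants to show;
    anything deeper is replaced by a single "..." line.
    """
    lines = ["  " * indent + focus.label]
    if not focus.children:
        return lines
    if depth == 0:
        lines.append("  " * (indent + 1) + "...")
        return lines
    for child in focus.children:
        lines.extend(display(child, depth - 1, indent + 1))
    return lines
```

Calling `display` on the root with a small `depth` gives the abbreviated whole-program view, while calling it on a subtree with a large `depth` gives the detailed view of one portion.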

The  final  feature  to  be  discussed  is  the  "move  toggle." 
As  mentioned  previously,  the  SDE’s  automatic  movement 
feature  is  optimized  to  permit  top  to  bottom  entering  of 
text  without  manually  moving  the  CP.  This  feature,  however, 
tends  to  act  against  the  user  when  editing  a  pre-established 
program  portion.  To  inhibit  this  feature,  capital  M  can  be 
entered. All subsequent movement of the CP must be directed
by the user through the movement commands discussed in
Section 2.C.

III. THE CONCEPTUAL BASIS OF THE SDE


A. PARSING AND TRANSLATION ON A CONTEXT-FREE GRAMMAR

The textual form of a computer program is written
according to the rules of the programming language's
grammar, which for most programming languages is classified
as being context-free. (Current programming languages often
include features that make them more complicated than
context-free. Two examples of such features are the
requirement for variables to be declared before they are
used and the requirement that procedures be declared and
invoked with the same number of parameters [Ref. 17: p.
140]. Languages with such features, however, are still
considered and treated as context-free, with their special
cases handled as exceptions on individual bases [Ref. 18: p.
26].) Formally, a context-free grammar is a four-tuple
(N, E, P, S), where N is a nonterminal alphabet, E is a
terminal alphabet, E and N are disjoint, S is the "start
symbol" and an element of N, and P is a set of productions
of the form A --> x such that in each production:

A is an element of N;

x is a string formed by combining any finite number
(including zero) of elements from N and E.

The set of terminal strings that can be formed by
applying the rules of the grammar is, in effect, the set of
programs that can be written using that grammar. A parser
is a program that, given a string of terminals, determines
if it is a legal program from the language's grammar. The
parser does this by finding the derivation (the sequence of
applications of the grammar's rules) that would produce the
terminal string. This derivation may be represented (and is
usually perceived) as a "derivation tree" whose root is the
nonterminal symbol S and whose leaves are the terminals that
make up the program. For example, consider the following
context-free grammar:

N = {A, T, F};

E = {+, *, (, ), q, r, s};

S = A;

P = the set of productions
A --> A + T
A --> T
T --> T * F
T --> F
F --> (A)
F --> q
F --> r
F --> s

The derivation tree of the legal program "q * (r + s)" is
shown in Figure 3.1.
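As a concrete illustration of what a parser does with this grammar, the following Python sketch builds a derivation tree for a token list or rejects it. It is a hand-written recursive-descent parser, chosen here only for brevity; the left-recursive productions are handled with loops, which is a standard substitute technique and not something the text prescribes. The names `parse` and `leaves` are hypothetical.

```python
from typing import List, Tuple, Union

Tree = Union[str, Tuple[str, list]]  # a terminal, or (nonterminal, sons)


def parse(tokens: List[str]) -> Tree:
    """Build the derivation tree for the sample grammar, or raise ValueError."""
    pos = 0

    def peek() -> str:
        return tokens[pos] if pos < len(tokens) else ""

    def take(expected: str) -> str:
        nonlocal pos
        if peek() != expected:
            raise ValueError(f"expected {expected!r} at token {pos}")
        pos += 1
        return expected

    def parse_f() -> Tree:
        if peek() == "(":
            return ("F", [take("("), parse_a(), take(")")])  # F --> (A)
        if peek() in ("q", "r", "s"):
            return ("F", [take(peek())])                     # F --> q, r, or s
        raise ValueError(f"unexpected token at {pos}")

    def parse_t() -> Tree:
        tree: Tree = ("T", [parse_f()])                       # T --> F
        while peek() == "*":
            tree = ("T", [tree, take("*"), parse_f()])        # T --> T * F
        return tree

    def parse_a() -> Tree:
        tree: Tree = ("A", [parse_t()])                       # A --> T
        while peek() == "+":
            tree = ("A", [tree, take("+"), parse_t()])        # A --> A + T
        return tree

    tree = parse_a()
    if pos != len(tokens):
        raise ValueError(f"unexpected token at {pos}")
    return tree


def leaves(tree: Tree) -> List[str]:
    """Read the terminals off the leaves of a derivation tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for son in tree[1] for leaf in leaves(son)]
```

For the token list of "q * (r + s)" the root of the returned tree is the start symbol A and `leaves` returns the original tokens; a string with an unmatched parenthesis raises `ValueError`.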

Figure 3.1  Derivation Tree of "q * (r + s)"

While the parser's function is, by definition, the
determination of the syntactic correctness of a program, the
derivation tree it creates is also an important product in
itself. In order for a computer to execute the original
program, the program must be translated into a more suitable
form known as intermediate code, which can either be interpreted
directly or optimized and translated again into
machine-executable code (compiled). Program translation is
accomplished through a "Syntax-Directed Translation Scheme,"
or SDTS [Ref. 18: p. 279], which conceptually transforms the
derivation tree into a "translation tree" by:

1) removing the terminal nodes;

2) permuting the children of each interior node according
to a particular translation rule;

3) adding new terminal nodes, members of a new terminal
set.

Formally, an SDTS is defined as a five-tuple
(N, E, D, R, S), where N, E, and S are the same as above, D
is the terminal alphabet of the translation, and R is a set
of productions A --> x, y such that in each production:

A is an element of N;

x is a string of terminals from E and nonterminals from N
(as above);

y is a string of terminals from D and nonterminals from N;

there is a one-to-one association of nonterminals in x and
y.

Note that by following an SDTS, two trees may be
constructed. The first is the derivation tree, produced
from the "A --> x" portion of the productions. The second
tree may be created from the "A --> y" portion of the
productions. This second tree is the translation tree, and
constructing it in parallel with the derivation tree accomplishes
the three conceptual transformations listed above
[Ref. 18: p. 296]. In fact, it is the translation tree, not
the derivation tree, which is the desired by-product of the
parser, for it is a representation of the program's intermediate
form.

As an example of program translation, consider the
following SDTS, which is an extension of the context-free
grammar described earlier:

N = {A, T, F};

E = {+, *, (, ), q, r, s};

D = {ADD, MPY, q, r, s};

S = A;

R = the set of productions
A --> A + T, A T ADD
A --> T, T
T --> T * F, T F MPY
T --> F, F
F --> (A), A
F --> q, q
F --> r, r
F --> s, s

The translation tree of the program "q * (r + s)" is shown
beside the program's derivation tree in Figure 3.2. Using
this SDTS, the translation of the original program is
"q r s ADD MPY" (which is the same expression in postfix, or
postfix Polish, notation).
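The relationship between the two halves of each production can be made concrete with a small sketch. It is written in Python purely for illustration; the rule names (`add`, `a_t`, and so on), the `Derivation` tuple, and the function `expand` are hypothetical. The sketch also assumes the nonterminals occur in the same order in x and y, which holds for this sample SDTS but is not required of an SDTS in general.

```python
from typing import Dict, List, Tuple

# rule name -> (left side, analysis part x, synthesis part y)
RULES: Dict[str, Tuple[str, List[str], List[str]]] = {
    "add":   ("A", ["A", "+", "T"], ["A", "T", "ADD"]),
    "a_t":   ("A", ["T"], ["T"]),
    "mpy":   ("T", ["T", "*", "F"], ["T", "F", "MPY"]),
    "t_f":   ("T", ["F"], ["F"]),
    "paren": ("F", ["(", "A", ")"], ["A"]),
    "q":     ("F", ["q"], ["q"]),
    "r":     ("F", ["r"], ["r"]),
    "s":     ("F", ["s"], ["s"]),
}
NONTERMINALS = {"A", "T", "F"}

# A derivation is (rule name, derivations of that rule's nonterminals in order).
Derivation = Tuple[str, list]


def expand(derivation: Derivation, part: int) -> List[str]:
    """Expand with the analysis part (part=1) or the synthesis part (part=2)."""
    rule, sons = derivation
    remaining = list(sons)
    output: List[str] = []
    for symbol in RULES[rule][part]:
        if symbol in NONTERMINALS:
            output.extend(expand(remaining.pop(0), part))
        else:
            output.append(symbol)
    return output


# The derivation of "q * (r + s)", written out rule by rule.
q_term = ("t_f", [("q", [])])                  # T --> F --> q
r_expr = ("a_t", [("t_f", [("r", [])])])       # A --> T --> F --> r
s_term = ("t_f", [("s", [])])                  # T --> F --> s
program = ("a_t", [("mpy", [q_term, ("paren", [("add", [r_expr, s_term])])])])
```

With these definitions, `" ".join(expand(program, 1))` gives the source text `q * ( r + s )` and `" ".join(expand(program, 2))` gives the translation `q r s ADD MPY`, so a single record of which rules were applied serves both trees.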


One should note that parsers seldom actually construct
the derivation or translation tree as conceptualized above.
Some more efficient representation, often involving a stack,
is frequently used instead [Ref. 18: p. 46].

B. THE SDE AS PARSER AND TRANSLATOR


As  stated  above,  two  major  functions  of  a  parser  are  to 
determine  syntactic  correctness  and  translate  the  source 
program  into  intermediate  code.  The  SDE  also  performs  these 
same  functions,  although  in  a  different  manner.  Syntactic 
correctness  is  assured  because  the  editor  only  creates 
correct programs. Translation is accomplished by dynamically
creating the translation tree during the editing
session.

The SDE creates and edits programs based on an input
grammar of the user's choice. The grammar file represents
productions in a manner consistent with the above discussion:
each production is of the form A --> x,y where x and y
are as described above. The SDE, however, uses the elements
of x and y in a manner different from that of the SDTS. In
the SDE, the "x" portion of a production (hereafter referred
to as the "analysis part") determines the textual form of
that production in the program as displayed to the user
during the interactive session. The analysis part of a rule
acts as a template which displays the terminals in a rule
and treats nonterminals as "holes" to be filled in using
other rules. The "y" portion of a rule (hereafter referred
to as the "synthesis part") determines what will be added to
the tree being created by the SDE when that rule is selected
by the user during the editing process.

Intuitively, the SDE only creates syntactically correct
programs because the terminals are written by the editor,
not by the programmer/user. Whereas a conventional parser
uses grammar rules to determine whether a given input string
of terminals is correct, the SDE uses the same grammar rules
to create the correct input string. For example, based on
the sample grammar above, the string "q * r + s)" is illegal
because of unbalanced parentheses. A programmer could
erroneously enter such an expression using a text editor,
and a parser would detect the error. If the programmer had
used the SDE to create the program, however, the parentheses
would have been balanced automatically when he selected the
rules in the grammar to produce the string he wanted. Note
that neither of the parentheses, nor the "*" or "+" operators,
would have actually been typed by the programmer at
all. The SDE would have displayed these terminals as parts
of the templates selected by the user.
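The template idea can be sketched in a few lines. The following Python is illustrative only and is not the SDE's implementation: `Hole`, `select`, and `text` are hypothetical names, and only the analysis parts of the sample grammar are modeled. The user never types a terminal; he only selects rules at holes, and the editor writes the terminals from the selected template.

```python
from typing import Dict, List, Optional, Tuple

# rule name -> (left side, analysis part used as the display template)
RULES: Dict[str, Tuple[str, List[str]]] = {
    "add":   ("A", ["A", "+", "T"]),
    "a_t":   ("A", ["T"]),
    "mpy":   ("T", ["T", "*", "F"]),
    "t_f":   ("T", ["F"]),
    "paren": ("F", ["(", "A", ")"]),
    "q":     ("F", ["q"]),
    "r":     ("F", ["r"]),
    "s":     ("F", ["s"]),
}
NONTERMINALS = {"A", "T", "F"}


class Hole:
    """A nonterminal leaf; `rule` stays None until the user selects one."""

    def __init__(self, nonterminal: str) -> None:
        self.nonterminal = nonterminal
        self.rule: Optional[str] = None
        self.sons: List["Hole"] = []

    def select(self, rule: str) -> List["Hole"]:
        """Apply a rule at this hole and return the new holes it opens."""
        left_side, template = RULES[rule]
        if self.rule is not None or left_side != self.nonterminal:
            raise ValueError(f"rule {rule!r} is not offered here")
        self.rule = rule
        self.sons = [Hole(s) for s in template if s in NONTERMINALS]
        return self.sons

    def text(self) -> str:
        """Unparse: terminals come from the template, holes show as <name>."""
        if self.rule is None:
            return f"<{self.nonterminal}>"
        sons = iter(self.sons)
        return " ".join(
            next(sons).text() if symbol in NONTERMINALS else symbol
            for symbol in RULES[self.rule][1]
        )
```

Selecting `mpy` at a `T` hole displays `<T> * <F>`; selecting `paren` at the `<F>` hole then displays `<T> * ( <A> )`. Both parentheses arrive together from one template, so an unbalanced string such as "q * r + s)" cannot be produced.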

When a user creates a program using the SDE, he selects
rules  to  be  applied  to  replace  the  nonterminal  leaves 
currently  in  his  unfinished  tree.  As  he  selects  each  rule, 
he  instructs  the  SDE  to  build  onto  the  translation  tree 
according  to  that  rule's  synthesis  part.  What  he  sees  on 
his  display,  however,  is  determined  by  the  analysis  part  of 
the  rule.  Thus,  the  programmer  dynamically  creates  the 
translation  tree  during  the  editing  session,  but  need  only 
concern  himself  with  the  textual  form  on  the  display.  Any 
translation  that  takes  place  is  hidden  from  the  user. 

Actually, the claim that the SDE creates a translation
tree is not entirely accurate for several reasons. First,
an SDTS allows the translation of terminals from E (the
source language alphabet) into other terminals from D (the
translation alphabet). While in the example given above the
user-defined names q, r, and s translated into themselves,
the technical definition of an SDTS does not require this,
so they could have been translated into any terminals in D.
The SDE, however, stores the actual user-defined names from
E in the tree it creates; no translation is performed on
them. (The SDE is capable of performing such translation,
but its implementation is inefficient and its format proves
exceptionally tedious to the grammar writer.)

Another reason it is inaccurate to claim the SDE creates
a translation tree is that the SDE is in fact a general-purpose
structure editor and not simply a program editor.
The term "translation tree" implies that the tree contains
information to be used further in a compilation or interpretation
process. While the SDE can certainly be used to form
such a tree, it is not limited to these applications. The
purpose and structure of the tree produced by the SDE depend
on the intention of the input grammar designer -- there
might be no "translation" involved. (Chapter 5 provides a
thorough discussion of the range of applications of the
SDE.)

Finally, the translation ability of the SDE is somewhat
limited. [Ref. 17] states that the intermediate code
produced from a practical SDTS is usually classified into
one of four categories: postfix, abstract syntax tree, quadruple,
or triple notation. A simple example of postfix (or
postfix Polish) notation has already been provided. The SDE
can provide such a translation -- Appendix D, for example,
lists an SDE input grammar representing the SDTS used
earlier in this chapter. Abstract syntax trees are simplified
derivation trees in which the interior nodes are operators
and the leaves are operands. The SDE generally cannot
create a tree simplified to this extent because certain
nodes having no semantic value need to be retained for the
display information in their analysis parts. (An interpreter
or code generator using such an intermediate form
would have to be tolerant of these useless nodes.) Triple
and quadruple notation are discussed in [Ref. 17]. The
possibility of translation to these forms was not explored
in preparing this paper.

Based on the above considerations, it is more accurate
to say that the SDE creates a tree which may possibly
achieve a translation of the source program into an intermediate
form useful to an interpreter or code generator. It
is always true, however, that the SDE insures syntactically

IV. IMPLEMENTATION OF THE SDE


A.  GENERAL 

The following sections present various aspects of the
SDE's implementation. First a general description of the
SDE's manner of interaction with the user is provided,
followed by a description of some of the SDE's data types
and primitive operations that act on or use these data
types. Each of the SDE's major processes is then presented
in detail. Finally, the SDE's program storage and retrieval
functions are discussed.

In  studying  the  following  sections,  the  reader  will 
wonder  why  certain  decisions  were  made  concerning  the  SDE's 
implementation.  Rather  than  discuss  each  such  decision, 
this chapter will present the SDE as it exists, explaining
trade-offs and decisions only where beneficial to that presentation.
Chapter Five, which represents an assessment of
the  SDE,  will  identify  strengths  and  weaknesses  of  the 
implementation  and  will  suggest  improvements  where  needed. 

As  a  general  comment,  however,  it  should  be  noted  that 
the  SDE  is  more  of  a  prototype  than  a  finished  product.  As 
such,  it  contains  structures  and  procedures  that  are  less 
than  optimal  in  terms  of  efficiency.  Some,  such  as  the  use 
of  a  linked  list  to  house  the  grammar  rules,  were  used 
because  they  were  logically  straightforward  or  easy  to 
implement  in  Pascal.  Much  of  the  implementation,  however, 
exists  because  the  SDE  was  constructed  based  on  the  concepts 
presented  in  [Ref.  15],  and  the  program  inherited  some  of 
the  conventions  of  that  paper.  For  example,  the  input 
grammar  format  utilizes  parentheses  to  delimit  alternations 
and analysis and synthesis parts of rules, reflecting the

Right/left: Access the syntax reference in the CN node and
find the CP's path name in its nonterminal dictionary.
Retrieve the preceding (for left) or succeeding (for
right) path name from the dictionary, as appropriate.
Traverse the CN's childlist until the corresponding path
is found, then set the CP to reference this path.
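The Parent, Child, Right, and Left movements can be sketched in terms of the CN and CP as follows. This Python is only an illustration under simplifying assumptions: each node carries its rule's nonterminal dictionary directly as an ordered list of path names, the childlist is a mapping from path name to node, and the names `Node`, `Position`, `move_parent`, `move_child`, and `move_sideways` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Node:
    # Path names of the nonterminal sons, in analysis-part order
    # (a stand-in for the rule's nonterminal dictionary).
    dictionary: List[str]
    parent: Optional["Node"] = None
    # Path name -> node at the end of that path, or None if not yet created.
    childlist: Dict[str, Optional["Node"]] = field(default_factory=dict)


@dataclass(eq=False)
class Position:
    cn: Node   # current node
    cp: str    # current path: names one of cn's nonterminal sons

    def move_parent(self) -> bool:
        if self.cn.parent is None:
            return False
        self.cn = self.cn.parent
        self.cp = self.cn.dictionary[0]   # first nonterminal child of new CN
        return True

    def move_child(self) -> bool:
        target = self.cn.childlist.get(self.cp)
        if target is None or not target.dictionary:
            return False                  # nothing to descend into
        self.cn = target
        self.cp = target.dictionary[0]
        return True

    def move_sideways(self, step: int) -> bool:
        """Use step=+1 for Right and step=-1 for Left."""
        index = self.cn.dictionary.index(self.cp) + step
        if not 0 <= index < len(self.cn.dictionary):
            return False
        self.cp = self.cn.dictionary[index]
        return True
```

Because Right and Left consult the dictionary, which follows the analysis part, the user moves according to what he sees on the screen regardless of how the synthesis part orders the sons in the tree.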

"Rest  Seq"  is  a  movement  command  available  only  when  the 
CP  references  an  item  in  a  sequence  of  syntactic  constructs. 
Such  a  sequence  is  implemented  in  a  particular  way,  as 
described in Chapter 4, and the "Rest Seq" command utilizes
that implementation. Discussion of this command is therefore
postponed until the following chapter.


will  lead  to  a  different  nonterminal  node  than  originally 
anticipated.)  Note  that  the  right  side  of  the  production 
must  contain  only  a  single  nonterminal.  If  it  were  to 
include  any  terminal  information,  such  information  would  be 
lost  during  unparsing,  because  the  node's  syntax  attribute 
would  reference  a  different  rule. 

The final matter to be resolved is how a user navigates
through the tree. Chapter 2 defined the five legal moves as
Parent, Child, Right, Left, and Rest of Sequence. Note that
in all cases, the user may only move to a nonterminal node.
(User-defined terminals are accessed indirectly by moving
the CP to the parent of the terminal, which is a grammar
nonterminal.) "Right" and "left" refer to nonterminal
brothers as indicated by a rule's analysis part, so that the
user may give commands based on what he sees on the screen.
(Recall from Section 3.C that a "right" brother in the analysis
part of a rule may actually be constructed as a "left"
brother as dictated by the synthesis part.) Further,
"child" selects the first nonterminal son of a node (again,
"first" being determined by the rule's analysis part).
Implementation of these commands is facilitated by the
structure of the tree nodes, the creation of the nonterminal
dictionary for each grammar rule, and the use of the CN and
CP to indicate one's current position in the tree. The
commands are implemented as follows:

Parent: Access the CN's "parent" reference. This reference
becomes the new CN value, and the CP is set to the
first nonterminal child of the new CN;

Child: Access the node at the end of the CP and set the CN
to reference this node. Set the CP to the path indicated
by the first entry in the nonterminal dictionary for the
new CN;

5) If the rule is not an alternation, follow Step 4 in the
earlier algorithm, change the CN to reference the node
just created, change the CP to reference the path to the
new CN's first nonterminal son, and repeat this algorithm
from Step 1.

Note  that  Step  5  in  the  above  algorithm  requires  finding  the 
first  nonterminal  son  of  the  (new)  CN.  What  if  this  node 
has  no  nonterminal  sons?  Intuitively,  such  a  node  should 
have  been  created  using  an  alternation  rule,  and  thus  Step 
4,  not  5,  should  be  executed.  The  reasoning  behind  this 
statement  is  that  unless  the  user  had  the  option  to  select 
the  particular  set  of  terminal  sons  from  other  choices,  the 
nonterminal  that  produced  these  terminal  sons  is,  in 
reality,  simply  an  abbreviation  for  that  sequence  of 
terminal  sons  elsewhere  in  the  grammar;  thus  an  equivalent 
grammar  may  be  designed  that  removes  these  deviant  cases. 
Appendix  C  provides  rules  for  grammar  design  which  result  in 
the  development  of  input  grammars  which  avoid  such  problems. 
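The revised creation algorithm can be sketched as a loop that expands non-alternation rules automatically and stops at the first alternation, where the user's command decides. The Python below is a simplified illustration, not the SDE's code: `Construction`, `Node`, `build`, and `create` are hypothetical names, a grammar is reduced to a mapping from nonterminal to rule, and the sample grammar is invented, loosely following the "n" (integer) command mentioned for Figure 2.1.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Construction:
    """A rule with a fixed synthesis part."""
    label: str        # what the synthesis part adds to the tree
    sons: List[str]   # nonterminal sons to create, in order


Alternation = Dict[str, Construction]   # user command -> chosen production
Grammar = Dict[str, Union[Construction, Alternation]]


@dataclass
class Node:
    nonterminal: str
    label: Optional[str] = None          # None while the node is still a hole
    sons: List["Node"] = field(default_factory=list)


def build(hole: Node, rule: Construction) -> None:
    """Step 4: fill the hole according to the rule's synthesis part."""
    hole.label = rule.label
    hole.sons = [Node(name) for name in rule.sons]


def create(grammar: Grammar, hole: Node, command: str) -> Node:
    """Expand automatically until an alternation consumes the command."""
    while True:
        rule = grammar[hole.nonterminal]
        if isinstance(rule, dict):                     # alternation
            if command not in rule:
                raise ValueError("command is not in the menu")
            build(hole, rule[command])
            return hole                                # exit the algorithm
        build(hole, rule)                              # Step 5
        if not hole.sons:
            raise ValueError("non-alternation rule with no nonterminal sons")
        hole = hole.sons[0]                            # first nonterminal son


GRAMMAR: Grammar = {
    "program": Construction("block", ["decls", "statement"]),
    "decls": Construction("decl-list", ["decl"]),
    "decl": {
        "n": Construction("integer", ["id"]),
        "r": Construction("real", ["id"]),
    },
}
```

Calling `create(GRAMMAR, Node("program"), "n")` builds the block and declaration-list nodes without any user input and applies the command only at the `decl` alternation. The second `ValueError` corresponds to the deviant case discussed above, a non-alternation rule that produces no nonterminal sons.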

One additional feature of the SDE should be discussed
here; it is possible to specify a rule with no synthesis
part at all. Such a rule is called an "identity rule"
[Ref. 15: p. 22], and is of the form A --> B where B is a
single nonterminal. The advantage of such a rule is the
creation of a more compact tree without loss of semantic
information, which in turn implies lower memory requirements
and quicker unparsing by the SDE as well as quicker interpretation
or code generation after editing. Whenever such a
production is encountered in the above creation algorithm,
the nonterminal on the left side is replaced by that on the
right and the rule for this nonterminal is used instead.
(This is one reason the term "expected node" was used in
describing what a path should lead to. If the rule stemming
from an expected nonterminal is an identity, then the path


to be created is the node's childlist, which will include
the identities of the paths indicated in the synthesis
part. Further, if any son is identified in the synthesis
part as being a terminal, this son is also created; it
will have no syntax part or childlist values.

Note that the above algorithm makes no use of the user's
input. In fact, if a grammar having only one possible
production from each nonterminal were input, user interaction
would be totally unnecessary -- but the grammar could
only create one program. Any useful grammar involves selections
from alternative productions to be applied at particular
locations in the tree. In the sample grammar in
Section 3.A, for instance, application of the T --> F
production results in a different expression than if the
T --> T * F production were selected. Note that whereas a
conventional parser selects the appropriate production based
on the input text, the SDE must allow the user to direct the
selection of a production as the program is being created.
This is accomplished through the alternation rules in the
grammar. The alternation is the only rule that relies on
the user's input at all -- all productions with predetermined
synthesis actions can be performed automatically.
This fact can be utilized to form a new creation algorithm
as follows:

1)  same  as  above; 

2)  same  as  above; 

3)  same  as  above; 

4) If the rule just found is an alternation, match the
user's command to the appropriate choice. Proceed with
Step 4 as above and exit the algorithm;

The key to program creation lies in the algorithm to
effect the special commands, and to understand this algorithm
it is first necessary to introduce the mechanism by
which the SDE identifies its "place" in the tree. The
current location is determined through a pair of values
called "CN" (current node) and "CP" (current path). The CP
always references a path from the CN to one of its nonterminal
sons. Viewed from the user's perspective as presented
in Chapter 2, the CN always references the parent of the
node of concern -- which is at the end of the CP. (This is
why "CP" was defined as "Current Position" in Chapter 2.
The SDE highlights whatever is pointed to by the Current
Path, through inverse video or some other terminal-dependent
feature. Since this is the only visual indication to the
user of his place in the program, the "Current Path" may
rightly be called the "Current Position" as well.)

The  CN  will  always  reference  a  node  currently  in  the 
tree,  while  the  path  indicated  by  the  CP  may  or  may  not  have 
a  node  at  its  end.  Note  that  special  commands  are  only 
valid  if  the  CP’s  path  has  no  node  at  its  end  —  in  other 
words,  something  can  be  added  but  not  overwritten. 
Recalling  that  the  synthesis  part  of  a  grammar  rule  names  a 
node  to  be  created  and  an  ordered  list  of  uniquely  identifi¬ 
able  paths  to  sons,  the  kernel  of  the  SDE's  creation  algo¬ 
rithm  is  as  follows: 

1)  Access  the  grammar  rule  that  created  the  CN  node 
through  its  "syntax"  attribute. 

2)  Search  the  nonterminal  dictionary  of  this  rule  to  find 
the  name  of  the  nonterminal  associated  with  the  CP. 

3) Look up the grammar rule for this nonterminal.

4)  Using  the  synthesis  part  of  the  rule  just  found,  create 
a  new  node  at  the  end  of  the  CP’s  path  in  the  tree.  Also 




abbreviations ("*", "+", and so on) have been omitted, as
have display considerations such as indenting, depth
handling, and so on. A more thorough explanation is
provided in Chapter 4.

The  conversion  from  tree  to  display  has  been  discussed. 
What  remains  is  to  discover  how  the  user  dynamically  creates 
the  tree,  a  process  which  shall  be  referred  to  as  "parsing" 
since  this  is  what  a  conventional  parser  would  do  given  a 
text  input. 

There  are  two  general  categories  of  user  input  to  the 
SDE. One category is the set of language-independent or
"standard"  commands  which  the  user  may  invoke  either  to  move 
about  in  the  tree  or  to  adjust  a  part  of  the  tree  already 
created.  They  are  standard  in  the  sense  that  (as  a  set) 
they  are  legal  at  virtually  any  phase  of  program  development 
or  position  within  the  tree.  Examples  of  standard  commands 
are Move Right, Delete, Grab, and so on.

The second category of input is the set of language-dependent,
"special" editing commands which cause creation
of  new  nodes  in  the  tree.  They  are  special  in  that  their 
appropriateness  is  strictly  dependent  on  one’s  location  in 
the  existing  tree  as  related  to  the  input  grammar.  For 
example,  commands  to  create  an  assignment  statement  are  not 
valid  when  positioned  in  a  "declarations  part"  of  the  tree. 

While standard commands have been defined as being legal
at  any  location  in  the  tree,  note  that  certain  members  of 
this  set  will  be  invalid  on  certain  occasions.  For  example, 
it  is  illegal  to  Move  Right  to  the  next  son  of  a  node  if  it 
has  no  more  sons  there.  Similarly,  it  is  illegal  to  access 
the  parent  of  the  root  node  or  descend  to  the  son  of  a  leaf. 
Such  exceptions,  however,  have  nothing  to  do  with  the  input 
grammar  being  used  —  they  are  purely  functions  of  one’s 
location  in  the  tree. 

3) terminals from the source language (representing user-defined
names): these nodes are similar to terminals of
the translation, but they contain user-provided terminal
symbols instead of grammar-prescribed names.

The  tree,  therefore,  is  a  structure  whose  interior  nodes  are 
nonterminals  and  whose  leaves  are  either  terminals  of  the 
translation or user-provided terminal names.

The displaying of the textual form of a program based on
its parsed tree form is known as "unparsing," and may be
represented in a recursive algorithm as follows. Assume the
existence of a tree as described above. To unparse such a
tree, begin at the root node and apply the following steps:

1)  If  the  node  is  a  terminal  of  the  translation,  take  no 
action  and  return  from  the  recursion; 

2)  If  the  node  is  a  user-provided  terminal,  display  the 
terminal  and  return  from  the  recursion; 

3)  If  the  node  is  a  nonterminal,  use  its  "syntax"  refer¬ 
ence  to  access  the  rule  that  generated  the  node.  Access 
that  rule's  analysis  part  and  nonterminal  dictionary. 
Consider  each  item  of  the  analysis  part  in  order: 

a)  If  the  item  is  a  terminal,  display  it; 

b)  If  the  item  is  a  nonterminal,  look  it  up  in  the 
nonterminal  dictionary  to  get  a  path  to  the  appropriate 
son  in  the  tree.  If  there  is  a  node  at  the  end  of  this 
path,  go  to  it  and  unparse  it  recursively.  If  there  is 
no  node  here,  display  the  name  of  the  expected  nonter¬ 
minal  on  the  screen  --  this  indicates  a  program  not 
fully  created. 
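
The three unparsing steps may be sketched as follows. This is an illustration in Python, not the SDE's Pascal code. The names "Rule", "Node", and "unparse" are assumptions, sons are held in a dictionary keyed by path name for brevity, and the nonterminal dictionary is consumed sequentially in analysis-part order. For brevity the example hangs a user-provided terminal directly at the end of a path.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    analysis: list      # ("t", text) for a terminal, ("n", name) for a nonterminal
    dictionary: list    # (path, nonterminal name) pairs, in analysis-part order


@dataclass
class Node:
    name: str
    kind: str                         # "nonterminal", "translation", or "user"
    syntax: Optional[Rule] = None
    children: dict = field(default_factory=dict)   # path name -> Node


def unparse(node, out):
    """Append the displayable pieces of `node` to the list `out` and return it."""
    if node.kind == "translation":    # step 1: terminals of the translation are silent
        return out
    if node.kind == "user":           # step 2: user-provided terminals are displayed
        out.append(node.name)
        return out
    entries = iter(node.syntax.dictionary)       # step 3: walk the analysis part
    for tag, value in node.syntax.analysis:
        if tag == "t":
            out.append(value)                    # 3a: display the terminal
        else:
            path, expected = next(entries)       # 3b: dictionary gives the path
            child = node.children.get(path)
            if child is not None:
                unparse(child, out)
            else:
                out.append(expected)             # placeholder for a missing son
    return out


# The rule A --> A + T with translation "T A ADD"; the A operand is not yet created.
rule = Rule(analysis=[("n", "A"), ("t", "+"), ("n", "T")],
            dictionary=[("opnd1", "A"), ("opnd2", "T")])
tree = Node("A", "nonterminal", rule,
            {"opnd2": Node("y", "user"), "oprtr": Node("ADD", "translation")})
text = " ".join(unparse(tree, []))
print(text)
```

The missing first operand appears in the output as the name of its expected nonterminal, which is how an incomplete program is signalled to the user.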

The  above  unparsing  algorithm  is  only  a  brief  summary  of 
what  the  SDE  performs.  Details  concerning  the  grammatical 


manipulation  routine.  Also  discussed  in  this  section  is  how 
the  user  moves  about  in  the  tree. 

To  understand  the  display  of  the  tree,  it  is  first 
necessary  to  visualize  the  tree  itself.  Each  node  of  the 
tree  is  a  record  structure  with  the  following  fields: 

name:  the  name  of  that  node  type; 

syntax: a reference to the grammar rule that produced the
node; 

parent:  a  pointer  to  the  parent  of  that  node  in  the  tree; 

childlist: an ordered list whose members point to the
sons  of  the  node  in  the  tree.  Each  such  pointer  is 
uniquely  identified  by  its  "path"  attribute. 

(The  above  description  of  the  childlist  is  the  first  glimpse 
of  the  SDE's  actual  implementation  presented  in  this  paper. 
Whereas  the  nodes  of  a  tree  are  usually  pictured  as  having 
direct  pointers  to  a  possibly  variant  number  of  sons,  the 
Pascal  implementation  of  the  SDE  necessitates  an  expandable 
linked  list  of  pointers.  This  implementation  detail  will 
remain  exposed  throughout  the  following  discussion  to  avoid 
confusion  in  Chapter  4,  when  the  full  implementation  is 
presented.)

The  above  record  structure  is  used  for  all  the  nodes  of 
a  tree,  although  these  nodes  may  fall  into  one  of  three 
categories: 

1)  nonterminals:  by  definition,  they  have  syntax  refer¬ 
ences  and  sons  in  the  tree; 

2)  terminals  of  the  translation  (such  as  "ADD") :  these 
nodes  have  no  sons  in  the  tree,  nor  do  their  syntax  parts 
reference  anything; 

in a second rule type, the "alternation" [Ref. 15: p. 28],
with the following structure:

name: same as above;

series of:

   choice id: the command the user will input to identify
   this selection;

   analysis part: same as above;

   synthesis part: same as above;

   nonterminal dictionary: same as above. Note that, as
   above, the dictionary is generated by the SDE, not
   provided by the grammar;

   display: what the SDE will display in the menu to
   describe this choice.

The  above  is  more  than  a  simplifying  convention.  It  allows 
the grammar designer to compose his own "display" and
"choice  id"  fields.  It  also  has  important  implications  to 
be  discussed  in  the  following  section. 

It should be noted that the SDE requires a strict format
for  its  input  grammars  which  has  been  avoided  in  the  above 
discussion.  A  thorough  description  of  grammar  input  and 
storage is included in Chapter 4 and Appendix B.

D.  TREE  CREATION,  DISPLAY,  AND  NAVIGATION 

As  mentioned  previously,  the  user  manipulates  a  tree 
structure  created  by  the  SDE  but  views  the  creation  in 
textual  form.  In  this  section  are  presented  the  two  algo¬ 
rithms  that  correspond  to  tree  manipulation  and  tree 
display.  Tree  display  will  be  discussed  first,  since  it 
introduces  concepts  needed  to  understand  the  tree 


a production rule may produce more than one son of the same
node type. Thus, productions of the form A --> ABBC are
also permitted. In such a case, it is the responsibility of
the nonterminal dictionary to distinguish between similar
nodes in a rule and provide the correct path to whichever
node is requested.

The notation provided thus far, and the two capabilities
listed above, are sufficient to show that the SDE supports
the set of all context-free grammars less those that contain
"e-productions," or productions of the form "A --> e" where
"e" represents the null symbol (or put another way, productions
whose right sides contain no terminals and no nonterminals).
[Ref. 19] states, however, that a context-free
grammar with e-productions is equivalent to one without such
productions, except in the case where the null string is a
member of the set of strings derivable from the grammar.
Thus, the SDE supports any context-free language that does
not include the "empty program."

The  SDE  input  grammar  convention  as  described  above, 
therefore,  is  sufficient  to  handle  all  useful  context-free 
languages.  For  ease  of  grammar  design,  however,  the  SDE 
also  supports  several  common  grammatical  conventions: 

the Kleene "*", meaning zero or more occurrences of a node;

the Kleene "+", meaning one or more occurrences of a node;

the ellipsis, meaning one or more occurrences of a node
separated by a delimiter;

the option ("?"), meaning zero or one occurrence of a node.

The  SDE  also  allows  an  abbreviated  format  for  collecting 
productions  with  common  left  sides.  This  convention  results 


2) path: opnd2    expected node: T


The name and analysis portions of the above rule are
straightforward. The synthesis part, when invoked, causes a
nonterminal node named "A" to be created in the tree. This
node will be given paths to three sons, and the paths will
be given unique names to distinguish them from each other.
The relationship between each path name and the expected
nonterminal node at the end of the path is recorded in the
nonterminal dictionary. Paths to terminals are omitted from
the dictionary.

The synthesis part of the above rule deviated from the
translation in Section 3.A to demonstrate that the order of
nonterminals in the analysis part need not be duplicated in
the synthesis part. Note that the nonterminal dictionary
entries preserve the order of the nonterminals as they
appear in the analysis part. Thus, a sequential access of
nonterminal dictionary entries will provide access to the
nonterminal sons in analysis part order, which is therefore
independent of the order in which they are logically stored
in the tree.
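
The generation of a nonterminal dictionary from a rule's analysis and synthesis parts may be sketched as follows. This Python fragment is an illustration under stated assumptions, not the SDE's Pascal routine: the function name "build_dictionary" and the tuple encodings are invented for the example, and repeated nonterminals are matched to sons in synthesis order, whereas the SDE distinguishes them by explicit marks in the grammar.

```python
def build_dictionary(analysis, sons):
    """Return (path, nonterminal) pairs in analysis-part order.

    analysis: list of ("t", text) or ("n", name) items, in display order.
    sons:     list of (path, kind, name) triples in synthesis order, where
              kind is "n" for a nonterminal son and "t" for a terminal son.
    """
    # Paths to terminal sons are omitted from the dictionary.
    available = [(path, name) for path, kind, name in sons if kind == "n"]
    entries = []
    for tag, name in analysis:
        if tag != "n":
            continue
        for i, (path, expected) in enumerate(available):
            if expected == name:
                entries.append((path, name))
                del available[i]        # each son is claimed only once
                break
        else:
            raise ValueError(f"no son in the synthesis part expects {name}")
    return entries


# The rule A --> A + T with translation "T A ADD".
analysis = [("n", "A"), ("t", "+"), ("n", "T")]
sons = [("opnd2", "n", "T"), ("opnd1", "n", "A"), ("oprtr", "t", "ADD")]
dictionary = build_dictionary(analysis, sons)
print(dictionary)
```

Although the synthesis part lists the T operand first, the resulting entries follow the analysis part, so the A operand's path comes first.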

(Note that input grammar rules for the editor in
[Ref. 16] need only contain analysis parts. This editor
generates synthesis parts based on an examination of the
analysis parts, which is possible because, as mentioned
previously, the editor creates a derivation tree, not a
translation tree.)

The SDE permits a grammar rule whose left side nonterminal
also appears on the right side of the rule. In other
words, recursive productions of the form A --> ABC are
permitted. (Grammars including such rules, however, must
provide alternative productions to apply to end the recursion.
This is analogous to the requirement that a context-free
grammar produce only finite-length programs.) Further,


synthesis  part:  the  name  of  the  node  to  create  in  the 
tree,  with  an  ordered  listing  of  the  sons  of  that  node.  A 
node  may  have  both  nonterminal  sons  and  terminal  sons. 
(Note,  however,  that  terminal  sons  represent  terminals  of 
the  translation,  not  terminals  in  the  textual  program); 

nonterminal  dictionary:  an  index  that  relates  the  nonter¬ 
minals  in  the  analysis  part  to  the  nonterminal  sons  in  the 
synthesis  part. 

The above components may best be understood through an
example. Consider the rule "A --> A + T" from the sample
grammar in Section 3.A, but assume its translation is to be
"T A ADD" (which represents a reversal of the order of the
operands). This rule must be redesigned for input to the
SDE as follows. Note the use of quotes to mark terminals
and parentheses to mark nonterminals in the analysis and
synthesis parts:

name: A

analysis part: (A) "+" (T)

synthesis part:

   node to be created: A

   sons of node are:

   1) path: opnd2    expected node: (T)

   2) path: opnd1    expected node: (A)

   3) path: oprtr    expected node: "ADD"

The  SDE  will  accept  this  rule  as  input,  adding  it  to  its 
list  of  grammar  rules  after  creating  for  it  a  nonterminal 
dictionary  containing  the  information: 


1) path: opnd1    expected node: A


correct  program  creation.  The  textual  representation  of  the 
tree  it  creates  is  therefore  acceptable  as  error-free  input 
to  a  conventional  parser,  so  the  SDE  is  a  useful  editor 
regardless  of  the  external  value  of  the  tree  itself. 

It  should  be  noted  that  the  SDE’s  translation  facility, 
however  limited,  represents  a  major  difference  between  the 
SDE  and  editors  such  as  that  described  in  [Ref.  16].  These 
editors  create  a  representation  of  a  program’s  derivation 
tree,  not  its  translation  tree.  It  may  informally  be  argued 
that  the  SDE  is  at  least  as  powerful  as  such  editors,  for 
derivation  trees  result  from  using  grammars  whose  synthesis 
parts  simply  reflect  the  nonterminals  in  the  corresponding 
analysis  parts. 

Finally, note that the use of the term "translation" in
this  chapter  refers  to  translation  from  a  high-level 
language to an intermediate form. Translation from one
high-level  language  to  another  is  also  possible  using  the 
SDE,  and  this  type  of  translation  will  be  discussed  in 
Chapter  5. 

C. A CLOSER LOOK AT INPUT GRAMMARS

As  mentioned  above,  an  input  grammar  to  the  SDE  is  a 
series of rules of the form A --> x,y where x is the analysis
part and y is the synthesis part of the rule. The SDE
accepts  these  rules  and  organizes  them  into  a  list  of 
records,  each  of  whose  members  contains  the  following 
fields: 

name:  the  name  of  the  rule  (and  the  nonterminal  being 
replaced);

analysis part: an ordered list of the terminals and
nonterminals  to  be  displayed; 

Lisp format used extensively in that paper. Data types
which model "a-lists" (or "association lists," collections
of attribute-value pairs) and "tagged a-lists" (a-lists each
possessing a single "tag" field at the head of the list)
also mimic structures found in the original paper, and the
SDE's algorithms are similarly based on those found in the
reference. Following the approach outlined in [Ref. 15]
facilitated the SDE's development. It is expected, however,
that subsequent implementations will deviate more from the
details of previous work to produce faster, more efficient
products.

B. INTERACTION WITH THE USER

Interaction between the user and the SDE can be divided
into two distinct phases: display of textual information
(the output of the SDE) and acceptance of the user's
commands (which serve as input to the SDE). At the highest
conceptual level, the SDE executes a cycle of displaying the
current status of the program and a menu of appropriate
commands based on that status, accepting the user's command,
and implementing the command. Interaction is thus a serialized
process of display and input. These two functions are
discussed individually below.

Display  of  SDE  output  requires  knowledge  of  the  specific 
terminal  type  in  use.  The  SDE  must  know  how  to  activate  and 
cancel  inverse  video,  how  to  position  the  cursor  at  the 
beginning  of  the  menu,  and  how  to  clear  the  screen  and  posi¬ 
tion  the  cursor  at  the  top  of  the  display.  It  must  also 
know  the  number  of  lines  and  columns  in  the  display. 
Because  taking  advantage  of  terminal-specific  algorithms  or 
capabilities  would  limit  the  SDE's  terminal  independence, 
knowledge  of  the  required  information  is  instead  provided 
from  a  user-supplied  external  file  (the  "TERM"  file 
mentioned in Chapter 2) that can be modified or replaced
whenever a different terminal is used. The SDE reads the
"TERM" file during initialization and forms four linked
lists of integers whose associated ASCII codes will, when
sent in succession to the screen, cause the display to
perform the four desired functions. Thus, to refresh the
display, the SDE writes the "clear screen" ASCII sequence to
the terminal, followed by the Pascal "writeln" commands that
display the program text; if the menu is to be displayed (as
determined by the "dsply togl" value), the "move cursor"
sequence is then sent to position the cursor on the menu
portion of the screen, and additional "writeln" commands
display the menu selections.

One  advantage  of  this  approach  is  that  any  sequence  can 
be  stored  in  the  "TERM"  file.  Terminals  that  do  not  support 
inverse  video,  for  example,  may  have  printable  characters  in 
their  "TERM"  files  so  that,  when  the  SDE  activates  the 
inverse  video  function,  these  characters  are  displayed 
instead.  Cancelling  the  inverse  video  may  be  done  in  the 
same  manner.  (An  example  of  such  a  display  may  be  seen  in 
Figure  4.1.)  Another  use  of  this  feature  involves  the  "move 
cursor"  sequence  that  positions  the  cursor  for  the  menu 
display.  On  some  terminals,  this  sequence  may  need  to 
include  instructions  to  clear  the  remainder  of  the  screen  in 
order  to  erase  the  previous  menu. 
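
The refresh cycle driven by the four stored sequences may be sketched as follows. This is a Python illustration, not the SDE's Pascal code. The layout assumed for the "TERM" file (one line of decimal codes per function, in the order shown) and the names "load_sequences", "emit", and "refresh" are assumptions made for the example; the actual file format is not reproduced here.

```python
import io

# Assumed order of the four functions within the hypothetical "TERM" data.
FUNCTIONS = ["inverse_on", "inverse_off", "menu_cursor", "clear_screen"]


def load_sequences(lines):
    """Turn four lines of decimal codes into four lists of integers."""
    return {name: [int(code) for code in line.split()]
            for name, line in zip(FUNCTIONS, lines)}


def emit(sequence, out):
    """Send the characters for a stored sequence to the display."""
    out.write("".join(chr(code) for code in sequence))


def refresh(seqs, program_lines, current, menu_lines, show_menu, out):
    """Clear the screen, write the program text, then (optionally) the menu."""
    emit(seqs["clear_screen"], out)
    for index, text in enumerate(program_lines):
        if index == current:                 # highlight the current position
            emit(seqs["inverse_on"], out)
            out.write(text)
            emit(seqs["inverse_off"], out)
            out.write("\n")
        else:
            out.write(text + "\n")
    if show_menu:
        emit(seqs["menu_cursor"], out)
        for text in menu_lines:
            out.write(text + "\n")


# A terminal without inverse video: printable "-> " and " <-" stand in for it.
seqs = load_sequences(["45 62 32", "32 60 45", "10", "12"])
screen = io.StringIO()
refresh(seqs, ["begin", "integer i", "end"], 1, ["m: move"], True, screen)
print(repr(screen.getvalue()))
```

Because the editor only ever emits whatever codes were loaded, substituting printable characters for a missing terminal feature requires no change to the display logic.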

The  input  of  user  commands  to  the  SDE  is  complicated  by 
the  use  of  Pascal  as  a  programming  language.  It  was 
initially  intended  that  the  SDE  refresh  the  screen  with 
every  character  the  user  input  so  that  he  could  see  the 
immediate  effect  of  his  entry  and  view  a  current,  relevant 
menu.  However,  Pascal  input/output  requires  that  a  carriage 
return  be  entered  in  order  for  prior  entries  to  be  read  by 
the  program.  Since  entering  a  carriage  return  after  each 
command  would  prove  unnecessarily  tedious  to  the  user,  the 

begin
   -> integer i
   -> integer i <-
   i := 0;
   while i < 10 do
   begin
      j := i * i;
      i := i + 1
   end
end


Figure 4.1 Sample Display for Terminal W/O Inverse Video

SDE  presently  accepts  a  string  of  commands,  up  to  80  charac¬ 
ters  in  total  length,  before  the  carriage  return  need  be 
sent.  The  string's  commands  are  processed  individually  as 
if  entered  separately.  The  list  of  legal  selections  is 
updated  after  each  selection  is  acted  upon,  although  the 
screen  (including  the  menu)  is  updated  only  after  the  final 
command  is  completed. 
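
The handling of a multi-command input line may be sketched as follows. This Python fragment is illustrative only. It assumes single-character commands, and it assumes that processing stops at the first command that is not currently legal; the text does not state how the SDE treats such a command. The callback names are hypothetical.

```python
MAX_LINE = 80   # maximum length of a command string, as stated in the text


def process_line(line, legal, apply, refresh):
    """Apply each command in `line` in turn; redraw the screen once at the end.

    legal()   returns the set of commands valid at the current position,
    apply(c)  carries out one command,
    refresh() redraws the program text and menu.
    Returns the number of commands carried out.
    """
    if len(line) > MAX_LINE:
        raise ValueError("command string exceeds 80 characters")
    done = 0
    for command in line:
        if command not in legal():      # legal selections are recomputed each time
            break
        apply(command)
        done += 1
    refresh()                           # the screen is updated only once
    return done


# Toy editor state: a position among three sons, moved with "r" and "l".
state = {"pos": 0, "refreshes": 0}


def legal():
    moves = set()
    if state["pos"] < 2:
        moves.add("r")
    if state["pos"] > 0:
        moves.add("l")
    return moves


def apply(command):
    state["pos"] += 1 if command == "r" else -1


def refresh():
    state["refreshes"] += 1


processed = process_line("rrl", legal, apply, refresh)
print(processed, state)
```

The set of legal selections is consulted before every command, while the display callback runs a single time per input line.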

C. DESCRIPTION OF SDE DATA TYPES

The  following  paragraphs  describe  the  data  types  used  by 
the  SDE  to  store  grammar  and  program  tree  information.  A 
formal  definition  of  these  data  types,  as  expressed  in 
Pascal,  is  presented  in  Appendix  A. 

At  the  outset,  the  SDE's  string  representation  must  be 
explained.  Because  Berkeley  Pascal  string  capabilities  are 
limited,  strings  are  stored  as  records  having  a  "wrd"  field 
containing  the  actual  characters  in  the  string  and  a  "len" 
field  representing  the  length  of  the  string.  Note  also  that 
the "wrd" field utilizes the Berkeley Pascal "alfa" type,
which  allows  a  maximum  string  length  of  10  characters. 

Grammar  rules  are  stored  in  a  linked  list  of  records, 
each  of  which  contains  the  name  of  the  rule  (the  nonterminal 
on the left side of the production) and a pointer to the
right side of the rule. If the rule is a simple production
(with a single right side), the pointer references a definition
containing analysis, synthesis, and nonterminal
dictionary parts. On the other hand, if the rule is an
alternation (as discussed in Section 3.D), the record points
to a second linked list whose members each contain the
single-character choice used to select the particular
production, the display string used to represent the choice
in the menu, and a pointer to the definition of the production.
Figure 4.2 represents the first three entries in the
linked list of rules in the "Minigol" grammar in Appendix E.
Note  that  all  rules,  simple  or  alternation,  eventually 


Figure 4.2 Storage of Rules in Minigol Grammar


A  closer  look  at  the  definition  part  of  a  rule  reveals 
that  it  consists  merely  of  pointers  to  analysis,  synthesis, 
and  nonterminal  dictionary  parts.  The  analysis  part  is  a 
linked list of records representing terminals and nonterminals.
(Note that the "info" field for a nonterminal is the
name  of  the  nonterminal,  while  the  same  field  in  a  terminal 
record  represents  a  string  of  characters,  possibly  including 
formatting  commands,  to  be  displayed  or  acted  upon  during 
unparsing.)  The  synthesis  part  represents  a  "tagged  a-list" 
(as  defined  in  Section  4. A)  whose  tag  represents  the  node  to 
be created in the tree and whose attribute-value pairs
represent  paths  to  expected  children.  The  nonterminal 
dictionary  is  an  "a-list"  whose  entries  reorganize  the 
information  contained  in  the  analysis  and  synthesis  parts. 

Note that all three definition portions contain an
"affix" field. This is a record of three characters used to
further describe the particular terminal or nonterminal.
Nonterminals may be affixed with a Kleene "*", Kleene "+",
ellipsis (represented by a single character), option symbol ("?"),
or nothing (represented by an "S" because the SDE currently
requires a non-blank character to be read). If an ellipsis
is indicated, a second character must be affixed to designate
how to separate the individual items in the sequence.
Finally, if two or more of the same nonterminal appear in a
single production, a third character (a "prime" mark or a
digit) must be provided to distinguish between them.
(Actually, in such a situation the distinguishing mark is
the first of the three affixes.) Terminals need no affixes,
but the field is included for simplicity of typing and to
prevent accessing nonexistent fields. (While a variant
record structure could have been used, it would only have
saved the three bytes of storage required by the affixes, at
the expense of more complicated logic as well as the space
required for the variant structure itself.)


A significant feature of input grammars is the use of
the SDE "set" data type to specify the members of an alternation
rule whose definitions contain only single-character
terminals on the right side of the productions. For
example, the production "char --> a | b | c" can be represented
as an SDE "set" instead of a linked list of alternatives.
Note that definitions eligible for representation as
sets  would  have  no  nonterminals  in  their  analysis  parts  and 
childless  tag  fields,  identifying  the  selections  made  by  the 
user,  in  their  synthesis  parts.  To  the  SDE,  such  produc¬ 
tions  perform  no  useful  function  except  to  record  the  user’s 
selections.  A  more  efficient  means  of  storing  such  produc¬ 
tions  in  a  grammar  (as  well  as  an  easier  method  for  the 
grammar  designer  to  write  them)  is  to  list  them  as  the 
(logical)  set  of  all  terminals  derivable  from  the  left-side 
nonterminal.  The  SDE  stores  a  grammar's  sets  in  a  linked 
list  whose  records  each  contain  a  set  name  (the  name  of  the 
left-side  nonterminal)  and  a  (Pascal)  set  of  all  the  deriv¬ 
able  terminals. 

The  above  paragraphs  describe  the  data  types  used  by  the 
SDE  to  store  grammar  information.  The  actual  program  tree 
is  created  and  maintained  using  two  general  record  types 
linked  together  in  a  tree  structure  through  use  of  pointers. 
The  actual  tree  nodes  are  "programnode"  records  having  name, 
syntax,  parent,  and  childlist  fields.  The  children  of  a 
programnode  are  accessed  through  the  childlist  field,  which 
points  to  a  linked  list  of  the  second  record  type,  the 
"childnode."  Each  childnode,  in  turn,  points  to  a  single 
child (of type "programnode").

Each programnode's syntax field contains a pointer to a
definition in the grammar linked list. Note that even if
the node were produced from an alternation, the syntax field
would reference the specific definition from the list of
alternatives. The parent field is a pointer to the node's
parent (in the tree), which is also of type "programnode."
The childlist field, as previously mentioned, is a pointer
to the first link of the childlist.

The  childnode  is  a  record  with  three  fields:  path, 
child, and next. The "next" field provides the linked list
structure.  The  "path"  field  is  a  string  type  and  records 
the  name  of  a  path  to  a  child  of  the  parent  programnode,  as 
dictated  by  the  synthesis  part  of  the  parent’s  syntax  refer¬ 
ence.  The  child  field  is  a  pointer  to  that  child,  as 
described  in  the  above  paragraph.  The  children  of  a 
programnode  are  thus  grouped  through  a  linked  list  of  child- 
nodes.  The  particular  child  in  question  is  accessed  by 
traversing  the  list  until  the  appropriate  path  is  found. 
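
The two record types and the path search may be sketched as follows. This Python rendering is only an approximation of the Pascal records defined in Appendix A. The function name "find_child" and the path name "n" are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class ChildNode:
    path: str                                  # name of the path to the child
    child: Optional["ProgramNode"] = None      # None while no node exists there
    next: Optional["ChildNode"] = None         # next link in the childlist


@dataclass(eq=False)
class ProgramNode:
    name: str
    syntax: object = None                      # reference to a grammar definition
    parent: Optional["ProgramNode"] = None
    childlist: Optional[ChildNode] = None      # first link of the childlist


def find_child(node, path):
    """Traverse the childlist until the childnode with the given path is found."""
    link = node.childlist
    while link is not None:
        if link.path == path:
            return link
        link = link.next
    return None


# A declaration node with two children, reached through two childnodes.
decl = ProgramNode("decl")
ident = ProgramNode("id", parent=decl)
integer = ProgramNode("int", parent=decl)
decl.childlist = ChildNode("n", ident, ChildNode("t", integer))

cn = decl                      # current node
cp = find_child(cn, "t")       # current path: a childnode of the current node
print(cp.child.name)
```

A search for a path that the parent does not possess runs off the end of the list and yields no childnode at all.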

Figure 4.4 represents a Minigol program tree segment for
an integer declaration. The figure includes three program
nodes, two of which are children of the third. The two
children are accessed through two childnodes. (Note in the
figure that the "t" childnode accesses an "int" program node
rather than a "type" node, as expected in the grammar rule
for declarations. This is because "type" is an example of
an identity rule as described in Section 3.D.)

Chapter 3 mentioned the use of "CN" and "CP" to keep
one's  place  within  a  program  tree.  Note  that  CN  is  a 
pointer  to  a  program  node,  while  CP  is  a  pointer  to  a  child- 
node.  In  Figure  4.4,  for  example,  the  "integer"  portion  of 
the  declaration  would  be  the  current  position  if  the  CN 
referenced  the  "decl"  node  and  the  CP  referenced  the  "t" 
childnode. 

It remains to be discussed how the above data types are
used to implement the "special" grammar features described
in Section 3.C: the sequence conventions (Kleene "*",
Kleene "+", and ellipsis) and the optional node ("?").
Sequences are implemented by creating SDE-defined "sequence
nodes." These are programnodes named "seq" having exactly

[Figure 4.4 Minigol Tree Segment: Integer Declaration. Diagram not
reproduced: a "decl" programnode (fields NAME, SYNTAX, PARENT,
CHILDLIST) whose childlist links two childnodes (fields PATH, NEXT,
CHILD) to an "id" programnode and an "int" programnode.]


two sons, the first of which is the node type being
sequenced, and the second of which is either a "nil" node,
which ends the sequence, or another "seq" node. A sequence
is thus implemented as a chain of "seq" nodes to each of
which is attached one item in the sequence.

The "nil" node mentioned above is a programnode named
"nil"  having  nil  references  in  both  its  syntax  and  childlist 
fields.  It  is  used  both  to  end  a  sequence  and  to  waive  an 
optional  node.  For  example,  suppose  an  "if"  production 
contains  an  optional  "else"  clause.  If  the  user  elects  to 
use  this  clause,  then  construction  continues  as  if  the 
clause  were  mandatory.  If  the  user  chooses  to  omit  this 
clause,  however,  the  childnode  representing  this  path  is  not 
destroyed;  rather,  a  "nil"  node  is  created  and  referenced  by 


the  childnode.  The  parent  programnode  thus  retains  paths  to 
all  of  its  children  as  dictated  by  its  synthesis  reference, 
even  if  some  of  those  children  are  not  used.  Note  that  the 
absence  of  a  "nil"  node  at  the  end  of  a  childnode  is  inter¬ 
preted  as  an  incomplete  tree  segment  by  the  SDE. 

Figure 4.5 demonstrates the use of both "seq" and "nil"
nodes. Note that the syntax references of the "seq" nodes
refer to the rule that generated the sequence, i.e., the
rule that created the parent of the entire sequence. Note
also that the first path name in the "seq" node's childlist
is inherited from this same rule. The second childnode's
name, however, is always "next."
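
The chaining of "seq" nodes and the terminating "nil" node may be sketched as follows. This Python fragment is an illustration, not the SDE's Pascal code. The names "Node", "build_sequence", and "sequence_items" are assumptions, the childlist is simplified to an ordered list of (path, child) pairs, and an empty sequence is assumed to be represented by a lone "nil" node.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    name: str
    syntax: object = None
    children: list = field(default_factory=list)   # ordered (path, child) pairs


def make_nil():
    """A "nil" node has nil references in both syntax and childlist."""
    return Node("nil")


def build_sequence(items, path, rule):
    """Chain the items into "seq" nodes, ending the chain with a "nil" node."""
    tail = make_nil()
    for item in reversed(items):
        # First son: the sequenced item, on the path inherited from the rule.
        # Second son: the rest of the chain, always on the path "next".
        tail = Node("seq", rule, [(path, item), ("next", tail)])
    return tail


def sequence_items(node):
    """Collect the items of a chain by following the "next" paths."""
    items = []
    while node.name == "seq":
        (_, item), (_, node) = node.children
        items.append(item)
    return items


declarations = [Node("decl"), Node("decl")]
chain = build_sequence(declarations, "d", rule="rule that owns the sequence")
print([node.name for node in sequence_items(chain)])
```

Every "seq" node in the chain carries the same syntax reference, that of the rule which produced the sequence as a whole.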

Finally,  user  selections  from  a  grammar  set  are  stored 
in  a  unique  way.  Sets,  as  stated  above,  are  useful  in  a 
grammar  to  note  the  characters  that  can  correctly  compose 
such  program  entities  as  variable  names,  numeric  constants, 
and  subroutine  names.  Because  the  SDE  checks  only  for 
syntactic  accuracy,  any  character  sequence  is  valid  if 
syntactically  correct  —  that  is,  variables  need  not  have 
been  declared,  scoping  rules  do  not  apply,  and  type  clashes 
are  irrelevant.  In  short,  the  only  important  information 
contained  in  a  name  is  the  name  itself.  It  would  prove 
wasteful  of  storage  to  allocate  a  separate  node  for  each 
letter in each variable name. The SDE therefore takes a
different approach by creating a "str" node. Unlike "seq"
or "nil" nodes, a "str" node is not given the name "str,"
but rather assumes the name of the particular set in question.
This node has a syntax value of nil and at least one
childnode. The first ten characters in the variable name
are stored in the "path" field of this childnode, and additional
childnodes are created and linked together as needed
to house the full name. Note that no programnodes are
constructed at the end of any of the childnodes. The value
of the childnodes is in the path names themselves.

Figure 4.5 Sequence of Declarations


D. SOME IMPLEMENTATION PRIMITIVES

This section presents some functions and procedures
used by the SDE to perform "primitive" operations.
Understanding them both facilitates explanation of higher-level
operations and provides insight into the organization
of the SDE as a program.

"Findrule"  is  a  function  that,  given  a  name,  checks  the 
list  of  grammar  rules  to  see  if  it  includes  a  rule  by  the 
given  name.  If  it  does,  a  pointer  to  the  entire  rule  is 
returned;  otherwise  a  nil  value  is  returned.  "Findset"  is  a 
similar  function  that  tries  to  match  the  argument  with  an 
entry  in  the  list  of  grammar  sets.  The  purpose  of  these  two 
functions  is  to  retrieve  the  rule  or  set  entry  in  a  grammar 
when  only  a  name  is  known.  The  use  of  the  nil  value  also 
serves  as  a  Boolean  indication  that  the  name  is  not  in  the 
specified  list. 
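
As a minimal sketch of the two lookups just described (written in Python rather than the SDE's Pascal, with the names `Rule`, `CharSet`, `find_rule`, and `find_set` assumed purely for illustration), the search and its use of a nil result might look like this:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    name: str
    is_alternation: bool = False


@dataclass
class CharSet:
    name: str
    members: str  # characters permitted in strings drawn from this set


def find_rule(rules: List[Rule], name: str) -> Optional[Rule]:
    """Return the rule called `name`, or None (the text's nil) if absent."""
    for rule in rules:
        if rule.name == name:
            return rule
    return None


def find_set(sets: List[CharSet], name: str) -> Optional[CharSet]:
    """Same search, over the grammar's list of sets."""
    for char_set in sets:
        if char_set.name == name:
            return char_set
    return None


# Hypothetical two-rule, one-set grammar.
rules = [Rule("stmt", is_alternation=True), Rule("while")]
sets = [CharSet("id", "abcdefghijklmnopqrstuvwxyz")]

# A None result doubles as the "not in this list" test described above.
is_set_name = find_rule(rules, "id") is None and find_set(sets, "id") is not None
```

Returning the whole entry rather than a Boolean lets one call serve both as a membership test and as the retrieval itself.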

"Lookup" is a function whose arguments are a pointer to
a program node and the name of one of the node's childpaths.
The  function  uses  the  node's  syntax  reference  to  access  the 
definition  that  generated  the  node,  then  matches  the  name  of 
the  childpath  with  an  entry  in  the  nonterminal  dictionary. 
The  function  returns  a  pointer  to  this  dictionary  entry. 
"Lookdn,"  on  the  other  hand,  is  called  with  a  pointer  to  a 
programnode and a dictionary entry as arguments. The function
traverses the node's childlist to find a match between
a  childpath  name  and  the  name  in  the  dictionary  entry,  and 
returns  a  pointer  to  the  childnode  thus  found.  These  two 
functions  are  used  in  a  variety  of  ways  throughout  the  SDE, 
both  individually  and  together.  For  example,  when  moving  to 
a  "right"  brother,  "lookup"  returns  the  dictionary  entry  of 
the  present  CP;  the  entry's  "next"  field  is  accessed  to 
obtain  the  right  brother's  path  name;  and  "lookdn"  is  used 
to  move  the  CP  to  the  appropriate  childnode. 
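
The "right brother" move can be sketched as follows. This is a Python illustration under assumed data shapes (`DictEntry`, `Definition`, `ChildNode`, `ProgramNode`, and the field names `attr`, `val`, and `next` are hypothetical stand-ins for the types of Appendix A), not the SDE's own code:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class DictEntry:
    attr: str                            # child path name
    val: str                             # rule or set expected on that path
    next: Optional["DictEntry"] = None   # entry for the right brother, if any


@dataclass(eq=False)
class Definition:
    ntdict: Optional[DictEntry]          # head of the nonterminal dictionary


@dataclass(eq=False)
class ChildNode:
    path: str
    child: Optional["ProgramNode"] = None


@dataclass(eq=False)
class ProgramNode:
    name: str
    syntax: Optional[Definition]
    childlist: List[ChildNode] = field(default_factory=list)


def lookup(node, path):
    """Dictionary entry, in the node's generating definition, for `path`."""
    entry = node.syntax.ntdict
    while entry is not None and entry.attr != path:
        entry = entry.next
    return entry


def lookdn(node, entry):
    """Childnode of `node` whose path name matches the dictionary entry."""
    for childnode in node.childlist:
        if childnode.path == entry.attr:
            return childnode
    return None


def move_right(cn, cp):
    """Return the CP of the right brother, or the same CP if there is none."""
    entry = lookup(cn, cp.path)
    if entry is None or entry.next is None:
        return cp
    return lookdn(cn, entry.next)


# Hypothetical "while" node with two paths, "cond" and "body".
body = DictEntry("body", "stmt")
cond = DictEntry("cond", "expr", next=body)
while_node = ProgramNode("while", Definition(cond),
                         [ChildNode("cond"), ChildNode("body")])
```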

The  last  set  of  primitives  to  be  discussed  are  those 
that  create  nodes  in  the  program  tree.  "Makenode"  is  a 
function  that  returns  a  pointer  to  a  new  programnode  (which 
is  created  as  a  "side  effect"  by  the  function).  To  make  the 
node,  the  function  is  provided  a  pointer  to  a  definition  in 
the grammar and the value of CN. It creates a new node with
a name as determined by the "tag" of the definition's
synthesis part, a syntax pointer to the definition itself, a
parent pointer to the CN, and a list of childnodes as indicated
by the attribute-value pairs in the definition's
synthesis  part.  Any  terminal  sons,  as  indicated  in  the 
synthesis  part,  are  also  created  at  this  time. 

"Makeseq" is a function that creates a "seq" node with
exactly  two  childnodes.  Since  the  syntax  and  pathname 
values  are  determined  by  the  parent  of  the  sequence  and  not 
by  the  expected  child,  no  definition  pointer  is  required  as 
input;  instead,  "makeseq"  is  called  by  providing  the  CN  and 
CP values. The CN provides the "seq" node's parent reference,
as in "makenode." If the function is creating the
first  "seq"  node  in  a  sequence,  the  CP  provides  the  pathname 
which  will  be  copied  in  the  "seq"  node's  first  childnode; 
otherwise,  this  information  is  copied  from  the  parent 
("seq")  node's  first  childnode,  and  the  CP  is  ignored.  The 
result  of  this  process  is  that  the  syntax  reference  and 
pathname  provided  by  the  original  parent  are  passed  down 
through  the  sequence  as  new  "seq"  nodes  are  created. 
Identifying  what  type  of  sequence  one  finds  himself  in  is 
therefore  a  simple  operation. 
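
The way the syntax reference and pathname are handed down a sequence can be illustrated with a short Python sketch. The class and field names are assumptions made for the example, and the string "block-rule" merely stands in for a pointer to a grammar definition:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class ChildNode:
    path: str
    child: Optional["ProgramNode"] = None


@dataclass(eq=False)
class ProgramNode:
    name: str
    syntax: object = None
    parent: Optional["ProgramNode"] = None
    childlist: List[ChildNode] = field(default_factory=list)


def makenil(cn):
    """A node named "nil" with nil syntax and childlist, parented to CN."""
    return ProgramNode("nil", syntax=None, parent=cn)


def makeseq(cn, cp):
    """A "seq" node with exactly two childnodes, parented to CN."""
    if cn.name == "seq":
        first_path = cn.childlist[0].path   # inherited; CP is ignored
    else:
        first_path = cp.path                # first link: CP supplies the name
    return ProgramNode("seq", syntax=cn.syntax, parent=cn,
                       childlist=[ChildNode(first_path), ChildNode("next")])


# Build a two-link sequence under a hypothetical "block" node, then tie it off.
block = ProgramNode("block", syntax="block-rule",
                    childlist=[ChildNode("decls")])
cp = block.childlist[0]
first = makeseq(block, cp)
cp.child = first
second = makeseq(first, first.childlist[1])
first.childlist[1].child = second
second.childlist[1].child = makenil(second)
```

Because every link carries the original parent's syntax reference and first path name, any "seq" node can report what kind of sequence it belongs to without walking back up the chain.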

"Makenil"  is  a  function  that  returns  a  pointer  to  a  new 
node  named  "nil."  The  new  node  is  also  given  nil  values  in 
its syntax and childlist fields. The purpose of this function
is to create a node that "ties off" sequences or
optionals  that  are  not  acted  upon.  Its  single  argument  is 
the  CN  value. 

"Makestr" is a function that creates a programnode to be
linked to a string of characters as specified by one of the
sets in the grammar. The function is invoked by providing a
pointer to an attribute-value pair (representing a child of
the "tag" field node) in a rule's synthesis part; it returns
a pointer to a newly created node. The new node is given
the same name as the "value" portion of the input pair;
note that it will always be the name of a set in the
grammar. The new node is given a nil syntax reference, like
the node created in "makenil," but is also given one childnode,
the path name of which is initialized to an "open"
symbol followed by blanks. (Later, as characters are added
to the string, they will be placed in successive locations
in the pathname of this childnode, and the "open" symbol
will be pushed toward the end of the word. As the last
character in the word is occupied, a new childnode will be
attached to the previous one, and the string will continue
in the pathname of this childnode. When the user ends the
string with the "endstr" command, the "open" symbol will be
changed to a "closed" symbol, indicating the completion of
the string.) Note again that unlike "seq" and "nil" nodes,
whose name fields contain "seq" and "nil" respectively, a
"str" node is given the name of the set whose members it
contains in its childnodes. Also, since the user-provided
string is stored in the childnodes, the names of these nodes
are not dictated by the grammar.
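
The growth of a string across ten-character path fields can be sketched in Python as below. The "open" symbol "_" is an assumption (the text does not name it), "K" follows the convention of Figure 4.8 for the "closed" symbol, and the sketch assumes that names contain no blanks:

```python
WIDTH = 10               # characters per "path" field, as stated in the text
OPEN, CLOSED = "_", "K"  # stand-in symbols (assumed, not the SDE's own)


def to_paths(text, closed):
    """Split text plus its end marker into blank-padded path fields."""
    marked = text + (CLOSED if closed else OPEN)
    chunks = [marked[i:i + WIDTH] for i in range(0, len(marked), WIDTH)]
    return [chunk.ljust(WIDTH) for chunk in chunks]


def makestr():
    """Path list of a fresh "str" node: an "open" symbol followed by blanks."""
    return to_paths("", closed=False)


def add_char(paths, ch):
    """Place ch where the "open" symbol was and push the symbol one place on."""
    text = "".join(paths).rstrip()
    if not text.endswith(OPEN):
        raise ValueError("string is already closed")
    return to_paths(text[:-1] + ch, closed=False)


def endstr(paths):
    """Change the trailing "open" symbol to a "closed" symbol."""
    text = "".join(paths).rstrip()
    return to_paths(text[:-1], closed=True)


paths = makestr()
for ch in "counter123x":      # eleven characters, so a second field is needed
    paths = add_char(paths, ch)
closed_paths = endstr(paths)
```

A Python list stands in here for the chain of linked childnodes; each list element corresponds to the path field of one childnode.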

E. DETERMINATION AND DISPLAY OF LEGAL CHOICES

Since  the  SDE  is  menu-driven,  a  mechanism  is  required  to 
determine  what  selections  should  be  made  available  to  the 
user.  If  this  information  is  retained  by  the  program  rather 
than  simply  displayed  and  forgotten,  it  can  also  be  used  to 
insure  the  user's  input  command  is  legal  before  it  is 
processed,  thereby  catching  errors  early  and  eliminating  the 
need  for  error  detection  and  recovery  in  the  command 
processing  portion  of  the  program.  For  these  reasons,  the 
set  of  legal  user  commands  is  assembled  into  a  linked  list, 
displayed  in  the  menu,  and  retained  for  comparison  against 

V. APPRAISAL OF THE SDE AND SYNTAX-DIRECTED EDITING

A. MEETING SDE DESIGN REQUIREMENTS: TRADE-OFFS AND SHORTCOMINGS

The  SDE  was  designed  to  meet  several  objectives.  It  was 
to be highly interactive to facilitate ease of use, yet be
table-driven  both  to  support  a  number  of  programming 
languages  and  to  enable  use  on  any  terminal.  As  a  program, 
it  was  to  be  portable  from  one  operating  environment  to 
another  and  facilitate  certain  modifications  a  user  may  wish 
to  make  to  the  SDE.  Finally,  a  general  objective  of  the  SDE 
was  to  be  as  easy  to  use  as  a  text  editor:  the  user  should 
not  have  to  "pay"  for  the  advantages  of  syntax-directed 
editing  by  enduring  the  clumsiness  of  the  tool. 

The  SDE  has  many  features  that  achieve  interactivity, 
perhaps  the  most  obvious  of  these  being  its  menu.  It  was 
felt  that  users  taking  advantage  of  the  SDE's  multi-language 
capability should not be required to know the details of any
input  grammar;  menu  display  was  therefore  appropriate. 
However,  menus  often  become  burdensome  in  interactive 
programs,  especially  as  the  user  gains  familiarity  with  the 
program  and  no  longer  needs  to  be  reminded  of  his  options. 
For  this  reason  the  SDE  permits  suppression  of  the  menu,  an 
option  which  has  the  added  advantage  of  presenting  a  larger 
part  of  the  edited  program  on  the  display. 

The  use  of  two  passes  through  the  unparser  involved  a 
design  trade-off  to  enhance  interactivity.  Only  one  pass  is 
required  to  convert  the  program  tree  to  textual  form. 
However,  experience  with  this  approach  revealed  that  the 
unparser  often  scrolled  needed  information  off  the  screen, 
hiding  the  program  portion  being  edited  and  making  the  menu 

"A", so there are two synthesis parts whose tag fields cause
the creation of a node representing "A" in the tree. These
nodes must be unique, however, so "a1" and "a2" are used.
In  this  particular  case,  finding  the  incorrect  synthesis 
part  would  have  destructive  effects  since  the  number  of  sons 
differs  in  the  two  rules.  The  program  would  thus  either 
encounter  too  few  or  too  many  names  in  the  stored  file. 
However,  consider  what  would  happen  if  two  synthesis  parts 
had  identical  tag  fields  and  similar  children  in  order  and 
type. "Findsyn" would find the first occurrence of the tag,
and because the numbers and orders of children agree, the
tree would be constructed without abnormal termination. The
syntax  reference  of  the  node  created  would  reference  the 
rule  found  by  "findsyn,"  with  the  result  that  the  text 
unparsed  from  this  portion  of  the  tree  would  reflect  the 
analysis  portion  of  the  first  rule,  not  the  one  intended. 
The  text  form  of  the  program  thus  would  be  incorrect  and 
misleading. 

On  the  other  hand,  the  SDE’s  manner  of  tree  storage  and 
reconstruction gives the editor the potential for translation
between programming languages. This potential will be
further  discussed  in  Chapter  5. 

Reconstruction of the tree from its stored form is somewhat
more difficult than storing the tree. The reading of
the first character in each name reveals whether it is a
string (the character being "'"), an incomplete tree segment
("<"), a terminal, "seq" or "nil" node (the quote), or the
"tag" field of a grammar rule's synthesis part. If it is a
tag field, function "findsyn" searches the grammar to find
the responsible synthesis part. Using this synthesis part,
the  SDE  constructs  a  new  program  segment  just  as  it  would 
during  an  editing  session.  It  also  uses  the  synthesis  part 
to  learn  how  many  sons  the  tag  node  has,  so  "readprogram" 
can  call  itself  recursively  the  appropriate  number  of  times. 

Encountering a "seq" node in the stored form causes the
creation of a "seq" node in the program tree. Because a
"seq" node always has two sons, "readprogram" calls itself
twice recursively. All other names preceded by a quote in
the stored form are given nil syntax and child references
and cause return from the recursion.

Strings  are  also  reconstructed  based  on  the  stored 
information.  The  name  of  the  "str"  node  to  be  created  is 
determined by using "lookup" to determine the name of the
expected node at this point in the tree. (The expected node
will always either be a set or lead to a set through a
series of identity rules.) The pathnames of the "str"
node's childnodes are input from the string in the stored
file. If the string is open, the subsequent nonterminal
enclosed  by  the  "<"  and  ">"  symbols  is  read  and  ignored;  its 
only  value  is  to  provide  documentation  to  the  user,  should 
he  wish  to  inspect  the  tree's  storage. 
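
The recursion described above can be sketched in Python. This is an illustration only: the table `sons_of_tag` is a hypothetical stand-in for what "findsyn" learns from a synthesis part, the tree is returned as nested tuples rather than programnodes, and "_" is an assumed "open" symbol:

```python
OPEN = "_"   # assumed "open" symbol; "K" marks a closed string as in Figure 4.8


def read_program(tokens, sons_of_tag):
    """Consume names from the front of `tokens` and rebuild one subtree."""
    tok = tokens.pop(0)
    if tok.startswith("'"):                  # a string stored in a set
        if tok.endswith(OPEN):
            tokens.pop(0)                    # skip the documentary <set> name
        return ("str", tok[1:])
    if tok.startswith("<"):                  # incomplete tree segment
        return ("open", tok[1:-1])
    if tok == '"seq':                        # a "seq" node always has two sons
        item = read_program(tokens, sons_of_tag)
        rest = read_program(tokens, sons_of_tag)
        return ("seq", item, rest)
    if tok.startswith('"'):                  # terminal or "nil": no sons
        return ("leaf", tok[1:])
    # Otherwise a tag: recurse once per son named in the synthesis part.
    sons = [read_program(tokens, sons_of_tag) for _ in range(sons_of_tag[tok])]
    return (tok, *sons)


sons_of_tag = {"asn": 2, "id": 1, "num": 1}          # hypothetical grammar data
stored = '"seq asn id \'iK num \'0K "nil'.split()
tree = read_program(stored, sons_of_tag)
```

The sketch also shows why the tags must be unique: the number of recursive calls depends entirely on which synthesis part a tag is matched to.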

The only restriction this routine places on input grammars
is that the tag field of every synthesis part must be
unique in order to guarantee that "findsyn" will find the
correct  synthesis  part.  Consider  the  grammar  in  Appendix  D, 
for  example.  There  are  two  productions  from  the  nonterminal 

H. STORAGE AND RETRIEVAL OF THE PROGRAM TREE


The  final  implementation  detail  to  be  discussed  is  the 
storage  of  the  program  tree  at  the  end  of  a  session,  and  the 
reconstruction  of  the  tree  from  the  stored  form  during 
subsequent sessions.

Storing  the  program  tree  is  done  simply  by  writing  the 
name  of  each  node  encountered  during  a  preorder  traversal  of 
the  tree.  Terminals  of  the  translation  (as  designated  in  a 
rule’s  synthesis  part),  sequence  nodes,  and  "nil"  nodes  are 
preceded by a quote mark. When childnodes with no nodes at
their ends (signifying an incomplete program tree) are
encountered, the expected node name is printed, surrounded
by the symbols "<" and ">". Strings stored in sets are
preceded by an apostrophe and printed as continuous
strings, even if they extend through several path names.
The "open" and "closed" symbols are also printed at the end
of strings. However, if the string was not closed, the name
of the set type is also printed, surrounded by the "<" and
">" symbols. Figure 4.8 shows the stored form of the
"Minigol" program presented in Figure 2.2. Note that the
letter  "K"  is  used  to  represent  the  "closed"  symbol;  the  SDE 
actually  uses  an  unprintable  control  character  for  this 
purpose. 
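
The storage scheme described above amounts to a preorder walk with a different prefix for each kind of node. A minimal Python sketch follows; the `Node` class, its `kind` field, the set names "letter" and "digit", and the "open" symbol "_" are all assumptions made for illustration:

```python
from dataclasses import dataclass, field
from typing import List

CLOSED, OPEN = "K", "_"   # "K" as in Figure 4.8; "_" is an assumed stand-in


@dataclass
class Node:
    name: str
    kind: str = "tag"          # "tag", "quoted" (terminal, seq, nil) or "str"
    children: List[object] = field(default_factory=list)  # Node, or str for an open path
    text: str = ""             # characters held by a "str" node
    closed: bool = True


def store(node, out):
    """Append the stored names of `node` and its subtree to `out`, preorder."""
    if isinstance(node, str):                  # path with no node at its end
        out.append("<" + node + ">")
    elif node.kind == "str":
        out.append("'" + node.text + (CLOSED if node.closed else OPEN))
        if not node.closed:
            out.append("<" + node.name + ">")  # name of the set type
    else:
        prefix = '"' if node.kind == "quoted" else ""
        out.append(prefix + node.name)
        for child in node.children:
            store(child, out)
    return out


def string(text, set_name, closed=True):
    return Node(set_name, "str", text=text, closed=closed)


tree = Node("seq", "quoted", [
    Node("asn", children=[Node("id", children=[string("i", "letter")]),
                          Node("num", children=[string("0", "digit")])]),
    Node("nil", "quoted"),
])
stored = " ".join(store(tree, []))
```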

block "seq decl id 'iK int "seq decl id 'iK int "nil
"seq asn id 'iK num '0K "seq while reln lt id 'iK
num '10K block "nil "seq asn id 'iK mult id 'iK id
'iK "seq asn id 'iK add id 'iK num '1K "nil "nil

Figure 4.8 Stored Form of Sample Minigol Program


the case of the ellipsis (with affix of "..."), the first son
of the "seq" node is unparsed, the character used to separate
the sequence items is printed (if there are more items
in the sequence), and the second son (another "seq" node) is
unparsed.

Note  that  the  entire  tree  need  not  be  unparsed  every 
time  the  screen  is  refreshed.  Rather,  only  the  visible 
portion of the tree has to be unparsed. Unparsing therefore
begins  at  the  focus  node,  not  necessarily  the  root  node,  and 
terminates  when  depth  limits  have  been  exceeded  or  the 
screen  is  already  full. 

"Unparse" uses three formatting instructions from the
input grammar. The grammar designer may specify formatting
to effect "prettyprinting" by placing these single-character
instructions in the terminal sequences of the analysis part
of rules. The symbol "/" denotes carriage return, while "\"
and "!" denote two-space indent and outdent, respectively.
Note that "\" has two effects: it immediately causes the
printing of two blanks, and also moves the unparser's record
of left-most column justification two spaces to the right.
The "!" symbol, on the other hand, only has the effect of
moving the left justification back two spaces to the left.
An example of the use of these symbols may be found in
Appendix E.

In  addition  to  the  formatting  as  specified  by  the 
grammar,  the  unparser  also  forces  a  carriage  return  and 
two-space  indent  (not  affecting  the  left  justification) 
whenever  a  line  is  too  long  for  the  display  (as  recorded  in 
the "TERM" file). In this way the unparser avoids splitting
of  tokens  onto  two  lines  and  keeps  track  of  the  actual 
number  of  lines  involved  in  the  display. 

bottom few lines of the display are unavailable for program
text. Finally, whatever is displayed on the screen must
include the CP's descendants for the menu to be meaningful
and for the user to view the program portion he is manipulating.
The SDE solves these problems by making two passes
through the unparser. In the first pass, the unparser
converts the program tree to text but does not send it to
the screen. Rather, it records the number of lines between
the first line to be displayed and the beginning of the
display of the CP's descendants, then calculates how many
lines must be unparsed but not displayed so that, when
display does commence, the CP's descendants will appear on
the center line of the usable screen area. During the
second pass of the unparser, text is sent to the screen only
after the calculated number of lines is unparsed. Likewise,
unparsing terminates after a prescribed number of lines is
displayed (to avoid scrolling).
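
The arithmetic behind the two passes can be sketched as follows. This Python fragment assumes the program text is already available as a list of lines and that the menu occupies four lines; both are simplifications made for the example, as is the function naming:

```python
def lines_to_skip(cp_line, usable):
    """Lines to unparse silently so that cp_line lands on the center row."""
    return max(0, cp_line - usable // 2)


def display(lines, cp_line, screen_lines=24, menu_lines=4):
    """Return the lines that would actually reach the screen."""
    usable = screen_lines - menu_lines
    # Pass 1: nothing is shown; only the skip count is worked out.
    skip = lines_to_skip(cp_line, usable)
    # Pass 2: emit only after `skip` lines, and stop before the screen scrolls.
    shown = []
    for number, line in enumerate(lines):
        if number < skip:
            continue
        if len(shown) == usable:
            break
        shown.append(line)
    return shown


program = [f"line {n}" for n in range(100)]
window = display(program, cp_line=50)   # CP's descendants begin at line 50
```

With twenty usable rows and the CP's text starting at line 50, forty lines are unparsed without display, so that line 50 falls on the eleventh row of the window.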

The  SDE  allows  the  user  to  select  the  depth  of  his 
display.  As  stated  earlier,  he  can  use  this  feature  to  view 
the  entire  program  from  a  broad  perspective  or  inspect  a 
small portion in its greatest detail. The feature is implemented
by passing a depth parameter to "unparse," which
decrements the value upon receipt. When "unparse" calls
itself recursively, it passes the decremented value. A
non-positive value causes the procedure to display only
"..." and return; no further unparsing is performed on this
tree  branch.  The  effect  is  that  unparsing  proceeds  to  the 
desired  depth  and  no  further.  Note  that  sequences  are  all 
considered  to  be  of  the  same  depth;  thus  a  degradation  of 
each  succeeding  item  in  the  sequence  is  avoided. 
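
A depth-limited unparser in this spirit might be sketched as below. The tree is modeled as `(name, children)` tuples and the output format `name(sons)` is invented for the example; the point of the sketch is only that "seq" links pass the depth through unchanged:

```python
def join(parts):
    """Join non-empty pieces with single blanks."""
    return " ".join(part for part in parts if part)


def unparse(node, depth):
    """Render `node` as text, showing "..." below the depth limit."""
    name, children = node
    if name == "nil":
        return ""
    if name == "seq":
        # Sequence links are transparent: both sons see the same depth,
        # so later items are shown in as much detail as earlier ones.
        return join(unparse(child, depth) for child in children)
    if depth <= 0:
        return "..."
    if not children:
        return name
    return f"{name}({join(unparse(child, depth - 1) for child in children)})"


def leaf(name):
    return (name, [])


tree = ("block", [
    ("seq", [("asn", [leaf("i"), leaf("0")]),
             ("seq", [("asn", [leaf("i"), leaf("1")]),
                      ("nil", [])])]),
])
```

Had the "seq" nodes counted as generations, the second assignment would sit one level deeper than the first and would be elided sooner.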

Sequences occur in a tree when a grammar rule specifies
a nonterminal with an affix of "+", "*", or "...". Unparsing
the first two cases is performed simply by ignoring the
"seq" nodes and recursively unparsing their two sons. In


{declarations: see Appendix A for types
   x: alist;             (for nt dict entries)
   g: grammar;           (rules)
   def: defnptr;         (ptr to definition part of rule)
   done, found: boolean; }

{global variables affected:
   cb: char;             (the user's command)
   cn: nodeptr           (the current node)
   cp: childptr          (the current path)
}

done:= false;
repeat
   x:= lookup (cn, cp^.path);   {RETURNS DICT ENTRY}
   g:= findrule (x^.val);       {RETURNS RULE PTR OR NIL.
        NOTE G IS RULE FOR EXPECTED NODE AT END OF CP}
   found:= false;
   while (not found) and (g <> nil) do
      if g^.isalternation then found:= true
      else if g^.defn^.syn <> nil then found:= true
      else g:= findrule (g^.defn^.anal^.info);

   {AT THIS POINT, G EITHER IS NIL (MEANING A SET WAS
    FOUND), POINTS TO AN ALTERNATION, OR POINTS TO A
    NON-ALTERNATION, NON-IDENTITY RULE WITH AT LEAST
    ONE NONTERMINAL SON. THIS IS GUARANTEED IF THE
    GRAMMAR WAS DESIGNED PROPERLY.}

   if g <> nil then begin
      {HERE IS FOUND CODE TO CREATE A "SEQ" NODE, IF
       APPROPRIATE. CODE OMITTED IN THIS FIGURE}
      if g^.isalternation then begin   {ALGO. STEP 4}
         def:= findoption (g);
         while def^.syn = nil do begin   {IDENTITY}
            g:= findrule (def^.anal^.info);
            def:= g^.defn;
         end;
         {NOW DEF^.SYN POINTS TO THE NODES TO BE ADDED}
         cp^.child:= makenode (def, cn);
         cn:= cp^.child;   {THE NODE JUST CREATED}
         if cn^.syntax^.ntdict <> nil then
            cp:= lookdn (cn^.syntax^.ntdict, cn)
               {I.E., SET CP TO FIRST NONTERMINAL SON}
         else moveup;      {OR FIND NEXT OPEN SPOT}
         done:= true;      {EXIT THE REPEAT LOOP}
      end
      else begin   {NON-ALTERNATION RULE, ALGO STEP 5}
         cp^.child:= makenode (g^.defn, cn);
         cn:= cp^.child;
         cp:= lookdn (cn^.syntax^.ntdict, cn);
      end;
   end;
until (g = nil) or done;

{AT THIS POINT, EITHER AN ALTERNATION HAS BEEN FOUND
 AND ACTED UPON, OR ELSE A SET WAS FOUND. IF A SET
 WAS FOUND, IT IS HANDLED IN SUBSEQUENT STATEMENTS.}


Figure 4.7 Portion of "Checkspeccmds"


alternation or set. The coding in this routine follows the
algorithm for node creation as presented in Section 3.D, and
relies on the restrictions of input grammars as discussed in
that section and in Appendix C. Figure 4.7 contains the
kernel of the procedure. As mentioned above, "checkspeccmds"
continues to add nodes until either an alternation
or set selection is found. The user's command is then acted
upon, and the procedure terminates.

G. UNPARSING: DISPLAY OF THE PROGRAM TREE

Section  3.D  presented  a  brief  summary  of  the  algorithm 
used  by  the  SDE  to  convert  the  program  tree  to  textual  form. 
This  conversion  is  performed  by  the  recursive  procedure 
"unparse,"  which  includes  mechanisms  to  handle  sequences  and 
sets.  When  unparsing  for  screen  display,  "unparse"  also 
activates  the  screen’s  inverse  video  (as  directed  in  the 
"TERM"  file)  while  the  CP’s  descendant  programnodes,  which 
reflect  the  user’s  "current  position,"  are  being  unparsed. 

As  one  of  its  arguments,  "unparse"  is  passed  a  pointer 
to  a  node  in  the  program  tree.  While  this  parameter  is 
sufficient  to  effect  conversion  of  the  tree  to  textual  form, 
two  additional  parameters  are  required  to  insure  proper 
display of the text on the screen. One is an integer representing
the current "depth limit," or the number of generations
of descendants the user wishes to be displayed. The
other  parameter  is  a  Boolean  value  reflecting  whether  this 
invocation  of  the  procedure  is  being  made  in  the  first  or 
second  "pass"  of  the  unparsing  effort,  as  discussed  below. 

Unlike  a  text  file,  the  terminal  screen  can  only  display 
a  fixed  number  of  lines  at  a  time.  Suppose  a  terminal 
displays  24  lines.  If  the  SDE  displays  more  than  this 
number  of  lines,  the  first  lines  will  be  scrolled  up  and  off 
the  screen.  Further,  if  the  menu  is  to  be  displayed,  the 


which dynamically creates new nodes and gives them all the
attributes of the nodes in the grabbed segment except those
that form the tree structure (such as parent, next, and so
on). The structure fields are given new values that refer
to (and result in) the new structure. Note that because no
garbage collection is performed when segments are erased,
such segments still reside in main memory and are therefore
available to be copied (if, of course, they were "grabbed"
prior to being erased).

The "insert before" command is implemented in an opposite
manner from the "erase" for sequences. Whereas "erase"
removes a link from a chained sequence structure, "insert"
installs a link into the sequence. Parent and child references
of surrounding nodes are adjusted to accommodate the
new link.

The  implementation  of  "endstring"  depends  on  whether  a 
user-defined  string  is  to  be  closed  or  a  sequence  or  option 
is  to  be  ended.  A  string  is  closed  by  changing  the  "open" 
symbol  at  the  end  of  the  "str"  node's  string  to  a  "closed" 
symbol.  A  sequence  or  option  is  closed  by  putting  a  "nil" 
node  at  the  end  of  the  CP. 

Finally, the toggle commands ("dsply" and "move") are
implemented by negating the previous values of Boolean variables.
Using the "display toggle" also adjusts a variable
containing the number of lines on the screen. When the menu
display is disabled, the unparser plans the display to
utilize the larger screen size.

The  above  paragraphs  describe  only  the  standard  commands 
input  by  the  user.  The  special  commands  are  implemented  in 
an iterative procedure called "checkspeccmds." Note that
whereas  the  routine  to  determine  legal  commands  (presented 
in  Section  4.E)  stepped  through  grammar  rules  to  find  an 
alternation  or  set,  "checkspeccmds"  creates  the  nodes 
dictated  by  the  rules  along  the  way  to  act  on  the 

"Erase" is another command implemented in two ways
depending on whether the CP is in a sequence. Normally, the CP's
childpointer is simply set to nil, thereby losing the
program segment previously referenced. (Note that the SDE
performs no "garbage collection.") When the CP references an
item in a sequence, however, the higher and lower portions
of the sequence structure are "tied together" so that only
the single link is erased.
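
The "tying together" of a sequence around an erased item can be sketched in Python. The node classes and the helper functions `build` and `items` are assumptions introduced for the illustration; only `erase` corresponds to the command described above:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class ChildNode:
    path: str
    child: Optional["ProgramNode"] = None


@dataclass(eq=False)
class ProgramNode:
    name: str
    parent: Optional["ProgramNode"] = None
    childlist: List[ChildNode] = field(default_factory=list)


def build(root, path, names):
    """Hang a chain of "seq" nodes, one per name, on `path`; end it with "nil"."""
    holder, parent = path, root
    for name in names:
        seq = ProgramNode("seq", parent=parent,
                          childlist=[ChildNode(path.path), ChildNode("next")])
        seq.childlist[0].child = ProgramNode(name, parent=seq)
        holder.child = seq
        holder, parent = seq.childlist[1], seq
    holder.child = ProgramNode("nil", parent=parent)


def items(path):
    """Names of the sequence items reachable from `path`."""
    found, node = [], path.child
    while node is not None and node.name == "seq":
        found.append(node.childlist[0].child.name)
        node = node.childlist[1].child
    return found


def erase(cn, cp):
    """Erase what CP references; inside a sequence, remove only that link."""
    if not (cn.name == "seq" and cp is cn.childlist[0]):
        cp.child = None            # ordinary case: the segment is simply lost
        return
    rest = cn.childlist[1].child   # lower portion of the sequence
    link = next(p for p in cn.parent.childlist if p.child is cn)
    link.child = rest              # higher portion now leads past the link
    if rest is not None:
        rest.parent = cn.parent


root = ProgramNode("block", childlist=[ChildNode("stmts")])
build(root, root.childlist[0], ["a", "b", "c"])
second = root.childlist[0].child.childlist[1].child   # "seq" node holding "b"
erase(second, second.childlist[0])
remaining = items(root.childlist[0])
```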

"Change  focus,"  "change  depth,"  and  "grab"  are  easily 
implemented.  For  "change  focus,"  a  variable  named  "focus" 
is  set  to  reference  the  CP's  programnode  child,  and  the  CN 
and  CP  are  changed  as  if  the  "child"  command  were  entered. 
Since  the  "focus"  value  determines  the  location  in  the  tree 
from  which  unparsing  begins,  only  this  subtree  is  displayed. 
For  "change  depth,"  an  integer  is  retrieved  from  the  command 
line (or else the user is prompted for one), and this value
is assigned to the "depth" variable. For "grab," a one-digit
integer is again retrieved from the command line or
from  the  user,  and  this  value  is  used  as  an  array  index  to 
store  a  pointer  to  the  CP's  child. 

"Put,"  the  complement  of  "grab,"  is  a  more  complicated 
operation.  First,  the  segment  to  be  implanted  must  be 
compatible  with  the  expected  node  at  the  end  of  the  CP. 
Also,  the  nodes  in  the  accessed  segment  must  be  copied  and 
their  parent  references  changed  so  that  the  installed 
segment  is  completely  independent  of  its  model  elsewhere  in 
the  tree.  Compatibility  is  checked  by  comparing  the  root  of 
the  segment  to  be  implanted  against  all  the  nodes  expected 
at  that  point  in  the  tree  based  on  the  grammar  rules.  Set 
compatibility  can  be  checked  because  the  "str"  node  retains 
the  name  of  the  set  whose  members  it  contains. 
Compatibility  of  sequences  is  checked  by  comparing  the  node 
type of the items in the sequence. Duplication of the
grabbed  program  segment  is  done  through  a  "copy"  routine. 


command  string  is  ignored,  the  screen  is  refreshed,  and  a 
message  informs  the  user  that  he  entered  an  illegal 
selection. 

As mentioned in Section 3.D, user commands may be
categorized  as  either  "standard"  or  "special."  Each  command 
in  the  command  string  is  first  examined  by  the  routine  that 
implements  standard  commands.  If  this  routine  does  not  act 
on  the  command,  it  is  instead  processed  by  the  routine  that 
handles  special  commands.  The  "stdcmd"  routine  is  therefore 
implemented  as  a  Boolean  function  that  returns  whether  or 
not  it  acted  on  the  command;  any  action  taken  is  performed 
as  a  "side  effect."  "Checkspeccmds"  is  a  procedure  invoked 
only  when  it  is  already  known  that  the  command  is  not  a 
standard  command. 

Implementation of the standard commands "right," "left,"
"parent," and "child" was discussed in Section 3.D. That
discussion,  however,  does  not  apply  when  the  CN  is  inside  a 
sequence.  In  such  a  situation,  "parent"  moves  the  CN  and  CP 
to  the  beginning  of  the  sequence  so  that  the  CN  references 
the  parent  of  the  entire  sequence  and  the  CP  references  the 
path  that  leads  to  the  sequence.  The  entire  sequence  is 
therefore  highlighted  during  subsequent  unparsing.  "Right" 
and  "left"  are  implemented  so  that  the  CN,  CP  pair  move 
"down  and  right"  or  "up  and  left"  in  the  sequence  structure 
to  reference  the  next  lower  or  next  higher  item  in  the 
sequence.  During  unparsing,  the  right  or  left  item  on  the 
display is highlighted. The "child" implementation is unaffected
by the presence of a sequence.

The  "rest  seq"  command  is  implemented  by  simply  shifting 
the  CP  to  the  second  son  of  the  CN.  Since  this  command  is 
legal  only  when  the  CN  is  a  "seq"  node,  the  CP  references 
its  first  son,  and  the  second  son  exists  and  is  not  "nil," 
moving  the  CP  to  the  CN's  second  son  positions  it  over 
another  "seq"  node.  The  CP  thus  accesses  a  "seq"  node  whose 
descendants  include  the  remaining  items  in  the  sequence. 

a) If the CP points to the first son of the "seq" node
(i.e., to one of the items in the sequence), then add
"right" (which means the next item in the sequence, to
the user) and "insert before" to the list. Also add
"rest seq" if the second son of the "seq" node exists
and is not "nil." Further, if the CN's parent is also a
"seq" node, add "left." (The user should be able to
travel left and right to items in the sequence without
knowing how the sequence is implemented. A "seq" node
above the CN means the CP does not reference the first
item in the sequence, so there are items to its left on
the display);

b) If the CP points to the CN's second son, add "left"
to the list. If the CP also has no program node at its
end, add "endstr."

Display  of  the  legal  choices  on  the  menu  is  accomplished 
by  accessing  each  node  in  the  list  of  choices  and  displaying 
the contents of the node. Control characters are unprintable
on a screen, so they are displayed as the two-character
sequence of "^" and the printable key to be typed on the
keyboard.  If  the  display  routine  detects  a  set  in  the  list 
of  commands,  "any"  and  the  name  of  the  set  are  displayed. 
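
A menu formatter of this kind can be sketched briefly. The Python below is an illustration only: the caret ("^") prefix is an assumption based on common terminal convention, and representing a set as a ("set", name) tuple is a hypothetical choice, not the SDE's node layout.

```python
def show_key(ch):
    """Render one command keystroke in printable form."""
    code = ord(ch)
    if code < 32:
        # control-T is code 20, shown as "^T" (assumed caret notation)
        return "^" + chr(code + 64)
    return ch


def menu_labels(commands):
    """Build the label for each entry in the list of legal commands."""
    labels = []
    for entry in commands:
        if isinstance(entry, tuple) and entry[0] == "set":
            labels.append("any " + entry[1])   # a set is shown as "any <name>"
        else:
            labels.append(show_key(entry))
    return labels
```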

F. PROCESSING THE USER'S COMMANDS

Every  command  entered  by  the  user  is  checked  against  the 
current  list  of  legal  commands.  A  command  is  determined  to 
be  legal  if  it  either  matches  one  of  the  commands  in  the 
command list or is a member of the set included in the
command list (as discussed in Step 3 of the algorithm
presented in the previous section). If the command is
legal, it is forwarded to the command processing routines of
the SDE. If it is an illegal command, the rest of the


{declarations (see Appendix A for types):
    x: alist;       (nt-dictionary entries)
    g: grammar;     (for rules)
    word: alfarec;  (with "wrd" and "len" fields)
    done: boolean; }

{Note x has already been initialized, using
 "lookup," to reference the nt dictionary entry
 for the CP path}

word:= x^.val;           {EXPECTED NODE NAME}
done:= false;
g:= findrule(x^.val);    {RETURNS RULE PTR OR NIL}
while (g <> nil) and not done do
    if g^.isalternation then done:= true
    else begin
        x:= g^.defn^.ntdict;
        if x <> nil then word:= x^.val    {WHICH SETS X
                                           TO 1ST DICT ENTRY}
        else                              {RULE MUST BE AN IDENTITY,
                                           SO FIND SINGLE NONTERM}
            word:= g^.defn^.anal^.info;

        {EITHER WAY, WORD NOW HOLDS NAME OF NEXT RULE
         OR SET TO BE CHECKED, SO UPDATE G VALUE}
        g:= findrule(word);
    end;    {LOOP BACK TO WHILE}

{Now, either g points to an alternation or is
 nil. If nil, then word holds the name of the
 set to use.}


Figure  4.6  Routine  to  Find  a  Set  or  Alternation 
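
The same search can be restated compactly outside Pascal. The following Python sketch mirrors the loop of Figure 4.6 under simplifying assumptions: rules are held in a dictionary keyed by name (so a failed lookup plays the role of "findrule" returning nil), and the keys "is_alternation", "ntdict", and "identity" are hypothetical stand-ins for the record fields in the figure.

```python
def find_set_or_alternation(start, rules):
    """Follow productions from `start` until an alternation or a set is reached.

    Returns (word, rule). If rule is None, `word` names a set; otherwise
    `rule` is the alternation found and `word` is its name.
    """
    word = start
    rule = rules.get(word)
    while rule is not None and not rule["is_alternation"]:
        if rule["ntdict"]:
            word = rule["ntdict"][0]   # first nonterminal dictionary entry
        else:
            word = rule["identity"]    # identity rule: its single nonterminal
        rule = rules.get(word)
    return word, rule
```

As in the figure, termination depends on the grammar design restrictions; a grammar whose chain of productions never reaches an alternation or a set would loop indefinitely.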


4)  The  final  commands  to  be  added  depend  on  whether  or  not 
the  CN  accesses  a  "seq"  node.  If  it  does  not: 

a) If the dictionary entry has non-nil "next" or "prev"
fields, add "right" or "left" to the command list;

b) If the affixes on the dictionary entry indicate an
optional child ("?") or a Kleene star ("*"), and if there is no
child presently at the end of the CP, add "endstr" to
the command list (to end an empty sequence or waive an
option);


5)  If,  on  the  other  hand,  the  CN  does  access  a  "seq"  node: 


user  input.  This  process  is  repeated  for  each  individual 
command. 

The list of legal commands is dependent on one's location
within the program tree. Note, however, that the
commands  to  end  the  session  and  change  the  current  depth, 
display  toggle,  or  move  toggle  settings  are  always  offered. 
The  routine  for  determining  additional  commands  is  as 
follows: 

1) If the CN has a parent value (which will always be true
unless the CN is the root node), add "parent" to the list;

2)  If  the  CP  has  a  program  node  at  its  end,  add  "erase" 
and  "grab"  to  the  command  list.  Further: 

a)  If  the  programnode  at  the  end  of  the  CP  has  a  syntax 
field  pointing  to  a  valid  nonterminal  dictionary  entry 
(and  is  therefore  not  a  "str"  or  "nil"  node),  add 
"child"  and  "change  focus"  to  the  list; 

b)  If  instead  the  CP  points  to  a  "str"  node  (detected 
when  "findset"  returns  a  non-nil  value  for  the  node's 
name), see if the string is closed; if it isn't, add
"any x" (where "x" is the set name) and "endstr" to the
list;

3)  If,  on  the  other  hand,  the  CP  has  no  node  at  its  end, 
add  "put"  to  the  list  and  follow  the  iterative  routine  in 
Figure 4.6. This routine searches the sequence of productions
that will be applied at this point in the program
tree  until  either  an  alternation  or  a  set  (which  is  really 
an  abbreviation  for  an  alternation)  is  found.  (Note  from 
Section  3.D  that  grammar  design  restrictions  insure  that 
one  of  these  two  conditions  will  always  be  met.)  When  the 
alternation  or  set  is  found,  add  the  appropriate  choices 
to  the  command  list; 
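
Steps 1 through 3 can be summarized as one routine. The Python sketch below is illustrative only; the function name, the `node_kind` encoding, and the parameters are hypothetical, and the choices contributed by the alternation or set in Step 3 are left to the caller.

```python
def base_commands(cn_has_parent, node_kind, string_closed=False, set_name=""):
    """Location-dependent commands from Steps 1-3.

    node_kind is None when the CP has no node at its end; otherwise it is
    "nonterminal", "str", or "nil".
    """
    commands = []
    if cn_has_parent:                       # Step 1
        commands.append("parent")
    if node_kind is not None:               # Step 2
        commands += ["erase", "grab"]
        if node_kind == "nonterminal":      # Step 2a
            commands += ["child", "change focus"]
        elif node_kind == "str" and not string_closed:   # Step 2b
            commands += ["any " + set_name, "endstr"]
    else:                                   # Step 3
        commands.append("put")              # plus the alternation or set choices
    return commands
```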


obsolete.  To  combat  the  situation,  the  user  was  forced  to 
repeatedly  alter  the  focus  and  depth  settings  to  position 
the  display  as  desired.  While  one  solution  to  this  problem 
would  have  been  to  display  only  a  constant  number  of 
programnodes on the screen, there is no correlation between
the rule referenced by a programnode and the amount of space
taken by that rule in the display. In the "Minigol"
grammar, for example, the "type" rule only requires the word
"real" or "integer" on a single line and lets other rules
continue on the same line, while the "block" rule requires
at least two lines all to itself. (This example demonstrates
a common, known grammar. One can imagine that a
user-designed  grammar  could  contain  even  greater  deviation 
in  space  requirements.)  A  two-pass  approach  was  therefore 
taken  as  described  in  Chapter  4.  The  trade-off  involved  is 
that  the  two-pass  unparse  is  obviously  less  efficient  than  a 
one-pass effort. Indeed, early work showed that interactivity
was noticeably affected by the delay between user
input and SDE response, mostly because the SDE kept the
focus node (from which unparsing begins) at the root of the
tree until the user changed it. This meant that a lot of
unparsing was taking place to display only 18 lines of text.
Efficiency was improved by modifying the SDE to automatically
adjust the focus node closer to the CP (by a heuristic
number of nodes) when appropriate. This made the two-pass
approach acceptably quick, and actually reduced programming
effort  in  other  portions  of  the  SDE  by  keeping  the  focus 
node  above  the  CP  in  the  tree.  Note  that  were  this 
violated,  as  could  occur  when  moving  the  CP  to  the  "parent" 
of  a  sequence  without  also  moving  the  focus,  the  display 
would  not  contain  the  CP  at  all,  and  the  menu  and  display 
would  be  meaningless  to  the  user. 
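
The focus adjustment can be pictured as a bounded walk up the tree from the CP. The following Python sketch is an illustration under stated assumptions: the node class, the function name, and the idea of a fixed `max_levels` bound are hypothetical, since the text says only that the distance is chosen heuristically.

```python
class TreeNode:
    """Minimal tree node with a parent link (illustrative only)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def adjust_focus(cp_node, max_levels):
    """Return an ancestor of cp_node at most max_levels above it.

    The walk stops early at the root, so the focus is never placed
    below the CP and the CP always falls inside the unparsed region.
    """
    focus = cp_node
    for _ in range(max_levels):
        if focus.parent is None:
            break
        focus = focus.parent
    return focus
```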

As  mentioned  in  Chapters  3  and  4,  the  use  of  the 
"command  string"  is  another  trade-off  forced  by  the  use  of 
Pascal as a programming language. Ideally, the SDE should
act on the program tree and refresh the screen (including
menu) with every keystroke the user makes; Pascal, however,
does not permit such an input method. Even if it did,
however, the refresh rate of most terminals would make this
approach unacceptably slow. Entry of a string of commands,
followed by a single carriage return, offers faster editing
at the expense of outdated menus and text displays. It is
offered as a temporary solution until either terminal technology
improves or a significantly different display algorithm
is developed for the SDE.

Inherent  in  the  requirement  that  the  SDE  be  interactive 
is  that  it  allow  storage  of  one's  work  for  subsequent 
editing. Unless useful to an interpreter or compiler (as
mentioned in Chapter 1), the parsed program file is useful
only  to  the  SDE  itself;  the  user  seeks  the  text  form.  The 
tree  need  never  be  stored,  therefore,  if  the  user  always 
writes  complete  programs  (which  will  never  be  modified)  in 
single  editing  sessions.  Such  a  restriction  on  program 
writing  is,  of  course,  unthinkable.  The  tree  is  therefore 
stored  to  enable  progressive  development  of  programs  by  the 
user.  As  mentioned  in  Chapter  4,  however,  the  SDE's  routine 
to  recreate  a  tree  from  the  stored  form  requires  that  the 
input  grammar  have  unique  tag  fields  in  its  rules.  While 
this  is  an  acceptable  restriction  on  grammars  used  only  by 
the SDE, it may have consequences on interpreters or compilers
seeking to act on the same stored form.

Pascal  does  not  permit  storing  of  pointer  values,  so  the 
tree  could  not  be  stored  in  any  direct  manner.  Text  storage 
was  chosen  because  it  facilitated  user  inspection  as  well  as 
tree  recreation.  Inspection  of  the  stored  form  is  hindered, 
however,  by  the  choice  of  "open"  and  "closed"  symbols  at  the 
end  of  strings.  These  symbols  must  be  unprintable  control 
characters  to  prevent  a  user's  input  characters  from  being 
misinterpreted. However, as control characters they may be
displayed in unpredictable, terminal-dependent ways; in
fact, they may be interpreted as commands to the terminal,
altering the display itself. This situation may be easily
remedied for a particular terminal by selecting different
"open" and "closed" symbols (as explained below), but a
universal solution is lacking.

A  second  requirement  of  the  SDE  was  that  it  be  table- 
driven.  It  has  been  shown  that  the  SDE  is  capable  of 
supporting virtually any context-free grammar. The restrictions
placed on grammar design in Appendix C limit only the
representation  of  such  grammars  but  do  not  restrict  the 
class  of  grammars  representable  in  this  form.  Appendix  B 
lists  the  format  required  by  the  SDE  to  input  such  grammars. 
Note  that  this  format  is  itself  a  context-free  "grammar"  for 
the  "language"  of  SDE  input  grammars.  The  SDE  can  therefore 
be  used  to  write  grammars  for  itself.  Appendix  B  includes 
the  syntax  of  input  grammars  in  a  format  suitable  for  input 
to  the  SDE.  It  was  used,  in  fact,  to  generate  the  "Minigol" 
grammar  in  Appendix  E.  As  with  any  "program"  written  using 
the  SDE,  the  grammars  in  Appendices  B  and  E  are  the  text 
(unparsed)  forms  created  in  editing  sessions. 

The  input  grammar  format  is  admittedly  awkward  to  the 
human  reader.  Certainly  better  grammar  representations 
exist, such as Backus-Naur form and Argot [Ref. 20]. It is
possible  to  use  the  SDE  to  create  grammars  using  these  more 
desirable  formats  as  follows: 

1)  Write  (by  hand  or  using  the  grammar  in  Appendix  B)  an 
input  grammar  to  the  SDE  that  describes  the  desired 
format.  The  synthesis  parts  of  the  rules  in  this  grammar 
must  exactly  match  the  synthesis  parts  in  the  grammar  at 
Appendix B (except as discussed in a later section).

2)  For  every  programming  grammar  to  be  designed: 


a)  Initialize  the  SDE  using  the  product  of  (1)  above  as 
the  input  grammar; 

b) Write the programming grammar during the editing
session;

c)  Save  the  PARSED  FORM  of  this  programming  grammar  and 
terminate  the  session; 

d) Re-initialize the SDE using the Appendix B grammar as
the  input  grammar  and  the  parsed  form  saved  in  Step  2c 
above  as  the  input  program; 

e)  Save  the  TEXT  FORM  of  the  tree  just  created  and 
terminate  the  session. 

The  text  grammar  created  in  Step  2e  above  may  subsequently 
be  used  as  an  input  grammar  to  the  SDE  to  create  programs  in 
this  new  grammar. 

The  key  to  the  above  technique  is  that  the  stored  form 
created  in  Step  2c  is  suitable  as  an  input  program  when  the 
SDE  is  using  either  the  grammar  written  in  Step  1  or  the 
grammar  in  Appendix  B.  Thus  the  SDE  is  used  to  translate 
programs  written  in  one  grammar  into  acceptable  programs 
under  another  grammar.  The  SDE’s  translation  capability  is 
explained  further  in  a  subsequent  section. 

Terminal  independence  has  been  achieved  for  the  most 
part. Use of the "TERM" file provides a degree of information
hiding in that the SDE achieves desired display effects
without  any  knowledge  of  how  they  are  implemented. 
Maintaining  a  set  of  files  in  the  "TERM"  format  also  enables 
easy  transition  from  one  terminal  type  to  another.  The 
simplicity  of  the  format  also  facilitates  user-adaptation  of 
new  terminals.  Appendix  F  lists  the  procedure  to  create  a 
"TERM"  file. 

Terminal independence was not fully realized in a
different sense, however. Three symbols, "%" among them,
were selected as special formatting commands because
they are not likely to be useful symbols to a grammar.
However, making them special commands not only renders them
unavailable for any other purpose, but also means that any
terminal lacking these keys cannot be used to specify
output formats for new grammars. Like "open" and "closed,"
however,  these  symbols  may  be  modified  by  the  user  to  adapt 
to  his  particular  environment  as  explained  below.  Note  that 
the  format  commands  should  be  printable  characters  (unlike 
"open"  and  "closed")  so  that  the  user  can  inspect  his 
grammar  and  effect  modifications.  A  way  to  avoid  the 
unavailability  of  the  selected  keys  would  have  been  to  use 
two-stroke  commands  to  direct  formatting.  The  sequences 
could  be  common  keys  without  making  these  keys  unavailable 
to  the  grammar  designer  for  other  purposes.  "%R"  could 
represent  "carriage  return,"  for  example,  and  both  keys 
could be used in other terminal sequences (except as a
pair).
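
The two-stroke alternative can be illustrated with a small scanner. The Python sketch below is hypothetical throughout: only "%R" for carriage return comes from the text, while "%I", "%O", the token names, and the function name are assumptions made for the example.

```python
# Only "%R" is suggested in the text; "%I" and "%O" are assumed here.
FORMAT_CODES = {"R": "newline", "I": "indent", "O": "outdent"}


def tokenize_format(template):
    """Split a synthesis template into literal text and format commands.

    "%" acts as a command only when paired with a known second key;
    otherwise both characters are kept as ordinary text.
    """
    tokens, text, i = [], "", 0
    while i < len(template):
        pair_ok = (template[i] == "%" and i + 1 < len(template)
                   and template[i + 1] in FORMAT_CODES)
        if pair_ok:
            if text:
                tokens.append(("text", text))
                text = ""
            tokens.append(("cmd", FORMAT_CODES[template[i + 1]]))
            i += 2
        else:
            text += template[i]
            i += 1
    if text:
        tokens.append(("text", text))
    return tokens
```

Under this scheme neither "%" nor "R" is lost to the grammar designer; only the adjacent pair is reserved.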

The  SDE  was  developed  in  a  Unix  environment  under 
Berkeley  Pascal.  While  the  SDE  is  not  claimed  to  be  written 
in purely "standard" Pascal (if indeed such a language
exists), the program is designed to minimize taking advantage
of its environment's unique facilities. It makes no
direct calls to the operating system, although it does
invoke the "argc" and "argv" routines to obtain parameters
from the Unix command line (see Section 2.A). These calls,
however,  are  purely  for  the  user's  convenience  and  can  be 
removed  from  the  SDE  without  affecting  the  rest  of  the 
program.  In  fact,  the  Unix  environment  actually  hinders  the 
SDE’s  interactivity.  The  reader  has  no  doubt  been  confused 
over  the  selection  of  such  meaningless  keystrokes  as 
control-T for the "child" command. Such keystrokes were
selected only because more meaningful commands (such as
control-C) are intercepted by the Unix operating system
before being passed to the SDE; under Unix, control-C aborts
program execution. Indeed, (capital) R for
"rest of sequence" was selected because all the other
control characters either were taken by the SDE or caused an
undesirable effect through Unix. The SDE copes with this
problem and at the same time offers extreme modifiability to
the user by declaring these command keystrokes as variables
which are defined during program initialization. When the
SDE is moved to another operating system, the installer can
redefine all of the SDE's standard commands to whatever keys
he wishes. He can also redefine the "open" and "closed"
symbols placed at the end of strings as well as the
carriage-return, indent and outdent formatting commands. It
is  recommended  that  all  of  these  keys  (except  the  formatting 
commands)  be  left  as  control  characters,  however,  because 
designation  of  printable  characters  renders  these  characters 
unavailable  for  grammar  or  program  design.  Note  also  that 
changing  the  "open,"  "closed"  or  formatting  commands  between 
sessions  will  render  previous  grammars  or  parsed  forms 
unusable. 

Finally,  the  SDE  was  intended  to  be  as  useful  as  a  text 
editor.  Toward  this  objective,  the  SDE  has  had  mixed 
success.  There  are  many  things  a  syntax-directed  editor  can 
do  that  a  text  editor  can  not.  In  fact,  as  mentioned  below, 
a  syntax-directed  editor  even  offers  certain  text-editing 
facilities  not  found  in  a  text  editor.  However,  the  present 
SDE  also  lacks  certain  features  considered  crucial  to 
successful  program  development.  First,  it  is  slower  than  a 
text  editor  in  that,  in  most  cases,  the  time  required  to 
enter  a  program  under  the  SDE  will  be  greater  than  under  a 
text  editor.  This  is  due  primarily  to  the  command  line, 
which  provides  feedback  only  when  the  display  is  refreshed. 


A much more desirable facility would be to exhibit the
effect of each keystroke immediately on the display, preferably
without having to redraw the entire screen. Such a
facility, however, would be extremely complicated, may
require extensive terminal-dependent features, and may even
require smarter terminals than presently available. Another
reason for the SDE's slowness is the use of control
sequences to effect commands; use of "arrow" or function
keys would be better, but they are not offered on all terminals.
A third reason for the SDE's slowness is inherent in
its one-way mapping from tree to text: one achieves movement
about the display only indirectly by moving through the
tree. These movements are often unintuitive and cause a
jerky motion on the display.

Aside from speed, another advantage of text editors over
the SDE is that the user may enter comments under a text
editor; no mention of such a facility has been made in this
paper. Comments were not included in the SDE's capabilities
because it was felt there are no standard conventions for
entering them. Some languages, such as LISP, FORTRAN and
assembly language, use a delimiter such as ";" or "C" to
indicate that the rest of the line is a comment. Others
such as Pascal use beginning and end delimiters such as
"{" and "}", and everything in between these delimiters is
considered a comment. Some versions of PL/I require
comments (enclosed by "/*" and "*/") to start and end on the
same line. Additionally, of course, individual users will
have their own preferences on how to display their comments
within the confines of these conventions. All of this
implies that comments may not be made a part of the SDE
program but must instead be indicated in the particular
grammar file being used. However, the only facility the
input grammar format has at present is to include an
optional "comment" node in every production in the grammar.


This would greatly increase the size of the program tree and
also slow down the editing session (because the user would
have to waive most of these optional nodes). One feasible
solution would be to include a "comments" command in the set
of standard commands, but input the details of displaying
such comments in the grammar file, possibly in a special,
identifiable production with a different notation. This
approach would probably require redesign of the programnode
structure to reflect the presence or absence of a comment at
this location in the tree. Simply extending the childlist
would not suffice because the SDE relies on the grammar-provided
knowledge of how many children to expect for each
node. Further, it is doubtful whether the comments should
be kept as part of the tree at all, particularly if the tree
is to be shared with other tools that ignore the comments
anyway. Sandewall [Ref. 3] points out that comments may
create memory management problems (such as fragmentation in
virtual memory systems) if the comments comprise a significant
percentage of the text. A more economical implementation
might be to maintain the comments in a separate file
and include index references to this file in the program
tree. However, such a system might perform slowly when
required to display comments in the context of the program,
such as in unparsing or prettyprinting. Further research on
implementing comments was not continued in the SDE development;
at present the SDE has no comment capability.

The SDE also lacks a "search" facility. Whereas text
editors can perform an exact pattern match between the
user's search string and the text, there is no such capability
when searching a tree of nodes, since the text exists
only on the screen. Related facilities such as "global
replace" are also lacking. There are several possible solutions
to this shortcoming. First, the SDE can actually
unparse the entire tree into a file, then read the file and
perform pattern matching. However, such a strategy would
only  be  able  to  acknowledge  whether  a  pattern  was  present, 
then  display  this  pattern  and  its  surrounding  strings  on  the 
screen;  the  SDE  would  be  unable  to  relate  this  pattern  to  a 
particular  node  in  the  tree  because  of  the  one-way  nature  of 
mapping  from  tree  to  text,  although  perhaps  a  complicated 
approach  could  effect  this.  The  SDE  would  certainly  be 
unable  to  effect  a  replacement  of  a  random  string  with 
another  because  only  user-defined  names  and  grammar 
templates  may  be  syntactically  changed.  A  better  strategy 
would  be  to  provide  a  limited  search  capability  restricted 
only  to  user-defined  names,  and  perhaps  to  maintain  a  table 
of  such  names  with  pointers  to  all  occurrences  of  them  in 
the  tree.  This  would  provide  rapid  response  to  the  search 
query  and  permit  transactions  (such  as  repositioning  the  CP 
or  changing  the  name)  to  be  made  on  the  tree.  Searches  for 
random  strings  such  as  "for  i  :="  could  not  be  implemented 
this way, but searching for the variable "i" would eventually
find the desired locations. A third strategy would be
to enable the SDE to perform "parsing" of the search string,
then to perform pattern-matching between the tree segment
thus created and the program tree. Note, however, that none
of these solutions offers as great a facility as is presently
available in text editors. On the other hand, the
second and third approaches offer a certain advantage in
exchange for their inflexibility in that they will retrieve
only those matching patterns that are also the proper structures
in the grammar. Whereas a text editor will retrieve
every occurrence of "i" in a program (including keywords
such as "is" and "in"), the above methods will retrieve only
the variables named "i."
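
The second strategy, a table of user-defined names with pointers to their occurrences, can be sketched as follows. This Python illustration is hypothetical: the node class, the "name" kind tag, and the function name are assumptions, and the SDE as described has no such table.

```python
class Node:
    """Minimal program-tree node (illustrative, not the SDE's programnode)."""

    def __init__(self, kind, text="", children=()):
        self.kind = kind              # e.g. "name" for a user-defined name
        self.text = text
        self.children = list(children)


def build_name_index(root):
    """Map each user-defined name to the tree nodes where it occurs."""
    index = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.kind == "name":
            index.setdefault(node.text, []).append(node)
        stack.extend(reversed(node.children))   # visit children left to right
    return index
```

Because the table holds node references rather than text positions, a search result can be used directly to reposition the CP or to rename every occurrence, and keywords never match.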

B. IMPROVEMENTS AND EXTENSIONS TO THE SDE


Like most useful programs, the SDE is not a static
product. It has evolved since the beginning of this
project, undergoing several significant improvements. One
such improvement was the use of pathnames to house string
information. This modification not only improved storage
efficiency and eased grammar design, but it also was implemented
within the data structures already established,
facilitating the modification as well as program
comprehension.

The  SDE  is  therefore  a  growing,  changing  product,  and 
certainly  it  is  not  perfect  in  its  present  state.  The 
shortcomings cited above must be overcome; there are additional
modifications that should be made to improve the
efficiency  with  which  the  SDE  presently  functions;  and 
further  modification  should  be  performed  to  give  the  SDE 
additional  features  and  capabilities. 

One efficiency improvement that can be made is the SDE's
implementation of sequences. Sequences are presently implemented
through chains of "seq" nodes each referencing
exactly one item in the sequence. While there are certain
advantages in knowing that each "seq" node has exactly two
sons, it would be more efficient to link all items in a
sequence as sons of the same "seq" node. Moving to right or
left brothers or to the parent would be facilitated;
however, other commands such as "rest seq" would have to be
implemented differently. Program tree restoration from the
stored form would also have to be altered, since the number
of recursive calls to "readprogram" would not be known at
the time the "seq" node was read.

Another  improvement  that  can  be  made  in  the  SDE  is  to 
reduce  the  storage  overhead  of  grammar  rules.  For  example, 
rules  presently  use  Pascal  strings  to  describe  nonterminals 
in their analysis, synthesis, and nonterminal dictionary
parts. These nonterminals always reappear as names of other
rules. A more efficient implementation would be to store
pointers to these rules in the dictionary parts that reference
them rather than using the names themselves. The overhead
of establishing pointers would be paid only once per
editing session during initialization. While pointers would
add indirection to operations such as displaying nonterminal
names from an analysis part, they would eliminate the overhead
of string-matching such as in "findrule." Another such
improvement would be the use of integers instead of ten-character
strings to represent path names in the synthesis
and nonterminal dictionary parts of the rules. In fact,
such pathnames could be generated sequentially by the SDE
for each rule; the grammar designer would not have to plan
for them or write them at all. Use of integers would not
only be more efficient for the grammar designer and for
storage management, it would also improve the efficiency of
looking down childlists to find a match between a given path
name and a particular childnode. "Lookdn" could simply
traverse the childlist the appropriate number of childnodes,
as determined by a path's integer "name."
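
With integer path names, the lookup reduces to counting links. The Python sketch below is an illustration of the proposal only; the class, its fields, and the 1-based numbering are assumptions, and the function borrows the name "Lookdn" without claiming to match the SDE routine.

```python
class ChildNode:
    """One entry in a childlist: a child plus a link to the next entry."""

    def __init__(self, node, next=None):
        self.node = node
        self.next = next


def lookdn(first_child, path_number):
    """Return the childnode at 1-based position path_number, or None.

    No string comparison is needed: the integer path "name" is simply
    the number of links to follow from the head of the childlist.
    """
    child = first_child
    for _ in range(path_number - 1):
        if child is None:
            return None
        child = child.next
    return child
```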

Another  improvement  to  be  made  in  the  SDE  is  the  storage 
of  user-provided  strings.  While  the  present  implementation 
represents a savings over a character-by-character implementation,
it remains wasteful. By making the "childnode"
structure a variant record, it could either reference a
programnode or a new structure designed especially for
string information. The "str" node and the establishment of
entire childnode records could be avoided. Another improvement
to string storage would be to include a "length" field
in the record itself, which would eliminate the present
delay in searching the entire string to find the "open"
symbol when adding new characters, as well as provide the
unparser's formatting mechanism with the length of the
string more quickly.

The need for all tag fields to be unique can be eliminated
through application of artificial intelligence techniques.
When restoring a tree, the SDE could determine
which synthesis part was referenced based on the nature of
the attribute-value children of the tags in the rules.

Another obvious modification to improve the SDE's efficiency
is to eliminate the use of two record types in the
program  tree.  The  childnode  only  contains  a  pathname,  a 
pointer  to  a  programnode,  and  a  link  to  another  childnode. 
The  programnode  pointer  field  can  be  eliminated  without  loss 
of  information  by  adding  the  other  two  fields  to  the 
programnode  structure  itself.  This  would  save  memory  space 
and  make  the  tree  structure  more  understandable.  Current 
SDE  use  of  the  CN  and  CP  values,  of  course,  would  need 
revision. 

Finally,  the  SDE's  lack  of  "garbage  collection"  should 
be  corrected  to  improve  efficiency  of  memory  use.  While  not 
disposing  of  program  segments  that  have  been  "erased"  by  the 
user avoids more complicated logic within the SDE (for
example, in the "grab" function, which capitalizes on the
present implementation), it can be extremely wasteful if a
large amount of editing is being performed. Past experience
with the SDE has been limited to operating on a VAX 11/780
minicomputer in the editing of small programs, and system
performance has not suffered under the present implementation.
However, the SDE should be made more economical to
insure  acceptable  performance  both  in  editing  larger 
programs  and  in  operating  on  smaller  computers. 

In addition to improvements which increase the efficiency
of the SDE, there are also some extensions to the SDE
that can be made to make it a more useful product in an
interactive programming environment. For example, the SDE
can be made a more powerful tool by planning to handle
certain semantic information as well as purely syntactic
detail. For instance, most programming languages are
categorized into only a few groups based on the type of
scoping they utilize. SNOBOL, APL, and traditional LISP use
dynamic scoping, while PL/I, FORTRAN and the ALGOL family of
languages use static scoping [Ref. 17: p. 38]. If the SDE
were so informed, it could perform type or scope checking
during the editing session, and discover undeclared variables
for statically scoped languages. Another example of
semantic help provided by a syntax-directed editor would be
the checking of parameter lists. Most languages pass parameters
either by name, reference, value, or some combination
of these methods. Knowing, for example, that Pascal implements
both "pass by value" and "pass by reference," a
syntax-directed editor could detect the passing of a literal
constant by reference (which is a security violation because
it risks altering the value of the constant) and so inform
the user [Ref. 21: p. 221]. It should be noted, however,
that the more such checking is "built in" to the editor, the
less likely it is to retain its generality as a "universal"
editor.
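
The scope check suggested above can be illustrated with a short
sketch. It is written in Python rather than the SDE's Pascal, and
the tuple-based tree layout and the name find_undeclared are
assumptions made purely for this example; the SDE's own program
tree (Appendix A) is organized differently.

```python
# Minimal sketch of a static-scope check over a program tree.
# Assumed (hypothetical) node shapes:
#   ("block", [declared names], [child nodes])
#   ("use", name)

def find_undeclared(node, visible=frozenset()):
    """Return the names used at a point where no enclosing block declares them."""
    kind = node[0]
    if kind == "use":
        return [] if node[1] in visible else [node[1]]
    if kind == "block":
        _, decls, body = node
        inner = visible | set(decls)   # static scoping: inner blocks see outer names
        found = []
        for child in body:
            found.extend(find_undeclared(child, inner))
        return found
    raise ValueError("unknown node kind: " + str(kind))


program = ("block", ["i"], [
    ("use", "i"),
    ("block", ["j"], [("use", "j"), ("use", "i")]),
    ("use", "j"),                      # j is out of scope here
])
undeclared = find_undeclared(program)
```

Because the check walks the tree rather than a text, an editor
could in principle run it after each insertion instead of waiting
for a separate compilation step.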

One  class  of  languages  for  which  the  SDE  could  provide 
some  very  useful  semantic  information  is  the  set  of  grammars 
it  may  generate.  The  input  grammar  in  Appendix  B  guarantees 
the  creation  of  syntactically  correct  grammars,  but  this 
alone  does  not  insure  that  the  grammar  produced  is  usable. 
It  does  not  insure,  for  example,  that  the  nonterminals  in  an 
analysis  part  agree  in  name  and  affix  with  those  in  the 
synthesis  part  of  a  rule,  or  that  all  such  nonterminals 
appear  as  rules  or  sets  elsewhere  in  the  grammar.  The  SDE 
presently detects certain grammar errors during initialization,
but lacks complete semantic checks and error recovery.
Further, it would be more desirable to detect such errors
during grammar creation rather than at attempted usage. To
be a better tool, therefore, the SDE should include a
routine that insures grammars are correct as they are being
created.  The  semantics  of  such  grammars  are  very  simple, 
and  it  is  anticipated  that  such  a  routine  would  be  small, 
efficient,  and  easy  to  implement. 
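
A routine of the kind proposed here might look like the following
sketch. It is given in Python for brevity and covers only the two
checks named in this paragraph; the dictionary layout, the
(name, affix) pairs, and the name check_grammar are hypothetical
and do not describe the SDE's internal tables. Alternations and
terminal children in synthesis parts are ignored.

```python
# Sketch of a grammar checker for regular (non-alternation) rules.
# rules: rule name -> (analysis nonterminals, synthesis nonterminals),
#        each nonterminal given as a (name, affix) pair.
# set_names: names of the character sets defined in the grammar.

def check_grammar(rules, set_names):
    defined = set(rules) | set(set_names)
    errors = []
    for rule, (analysis, synthesis) in rules.items():
        # Nonterminals must agree in name and affix; order may be permuted.
        if sorted(analysis) != sorted(synthesis):
            errors.append(f"{rule}: analysis and synthesis nonterminals differ")
        # Every nonterminal must appear as a rule or a set elsewhere.
        for name in sorted({nt for nt, _ in analysis + synthesis}):
            if name not in defined:
                errors.append(f"{rule}: '{name}' is neither a rule nor a set")
    return errors


rules = {
    "sample": ([("A", "&")], [("A", "&")]),
    "A": ([("T", "&")], [("T", "&")]),
    "T": ([("F", "&")], [("F", "+")]),   # affix mismatch, and F is undefined
}
errors = check_grammar(rules, {"char"})
```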

C.  IMPLICATIONS  OF  SYNTAX-DIRECTED  EDITING

This  paper  has  presented  an  implementation  of  a  table- 
driven  syntax-directed  editor.  While  such  an  editor  is 
intended  for  use  in  creating  syntactically  correct  programs, 
syntax-directed  editing  offers  capabilities  that  extend  far 
beyond  this  function. 

The grammar-generating grammar in Appendix B offers a
rapid  means  with  which  language  designers  can  examine  their 
research  grammars.  When  contemplating  a  new  language,  the 
designer  can  input  his  ideas  into  the  SDE,  then  use  it  to 
quickly  see  a  textual  example  of  his  grammar.  A  grammar 
checker  as  proposed  in  the  previous  section  would  further 
insure  that  his  grammar  was  complete.  Further,  the  SDE  can 
display  programs  written  in  an  incomplete  grammar  (up  to  a 
point),  so  that  the  grammar  may  be  developed  incrementally. 
Finally,  the  designer  may  use  the  SDE  to  create  program 
trees  using  his  grammar,  allowing  him  to  insure  that  the 
desired  semantic  information  can  be  derived  from  his  source 
language.  Lacking  a  syntax-directed  editor,  these  results 
could  only  be  obtained  after  developing  a  scanner  and  parser 
for  his  language  (although  parser-generators  exist  to  help 
in  this  effort)  and  then  writing  complete  programs  in  his 
new  language  using  a  text  editor. 

While use of a syntax-directed editor as described above
facilitates the development of "traditional" grammars, i.e.,
those restricted to the class of LR(1) languages [Ref. 17:
p. 261], syntax-directed editing implies a far greater range
of possible programming languages in the future. In the
past, language design has been limited to those languages
that can be parsed effectively. Syntax-directed editing
removes such a limitation on grammar design, since the
editor is told by the user which production to apply at a
given point in the program. The program tree is generated
directly during program creation, not through parsing a
text. The implication is that future languages may be
designed to be more readable to the human reader rather than
to a parser. Consider, for example, the use of the "where"
clause in the applicative expression notation "L where X =
M," in which the "where" clause binds variables in L, as
listed in X, to values derived from the expression M
[Ref. 22: p. 314]. This is difficult to parse conventionally,
since binding details are provided after the variables
are used and because determination of which production to
use requires "look-ahead" of as many tokens as are in L.
Such a notation presents no difficulty to a syntax-directed
editor, however, since the user provides determinism by
directing the application of the intended production. Note
also that syntax-directed editing eliminates the need for
current lexical conventions, since no scanning need be
performed. Thus, a variable named "this is a single variable"
is perfectly acceptable when input through a
syntax-directed editor.
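
To make the point concrete, the following Python sketch builds a
program tree directly from a sequence of user choices, with no
scanner or parser involved. The productions follow the small
expression grammar used as the Chapter 3 example, but the
dictionary encoding and the function name build are assumptions of
this sketch, not the SDE's Pascal implementation.

```python
# Sketch: the user, not a parser, selects each production.
# Each nonterminal maps menu keys to (tag, list of child symbols).
GRAMMAR = {
    "A": {"1": ("a1", ["T"]), "2": ("a2", ["A", "T"])},
    "T": {"1": ("t1", ["F"]), "2": ("t2", ["T", "F"])},
    "F": {"1": ("f1", ["char"]), "2": ("f2", ["A"])},
}

def build(symbol, choices):
    """Expand `symbol` top-down, taking one user input per decision."""
    if symbol == "char":               # set member: the input is the character itself
        return next(choices)
    tag, children = GRAMMAR[symbol][next(choices)]
    return (tag, [build(child, choices) for child in children])


# Keystrokes a user might enter to create the expression "q + r".
tree = build("A", iter("2111q11r"))
```

Because every decision is supplied explicitly, the left-recursive
production for A causes no difficulty here, although it would need
special treatment in a top-down parser.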

Even the one-dimensional concept of a program as a
stream of characters is made obsolete by syntax-directed
editing: two-dimensional languages may be designed, using
symbols difficult to parse traditionally. For example, the
equations in Figure 5.1 are unacceptable to a traditional
parser, yet can be parsed and displayed by a syntax-directed
editor with the proper formatting commands. In short,
syntax-directed editing opens an entire realm of programming
languages previously considered impossible.

Figure 5.1 Examples of Two-Dimensional Equations

Syntax-directed editors also have an implication in the
education of computer programmers. It would be unnecessary,
for example, to teach the syntax of any language.
Whereas the Pascal student must currently know that statements
always end with a semi-colon, this need not be known
by a student using a syntax-directed editor, since the
editor installs the semi-colon automatically. The student
can thus learn the semantics of the language more readily,
being freed from syntactic concerns.

The use of a syntax-directed editor as a translator has
already been demonstrated in Section 5.A. There, the SDE
was used to translate grammars from Argot into the input
format of the SDE. Translation between grammars is possible
when both grammars have corresponding rules with identical
tag names and child patterns in their synthesis parts. They
need not be completely identical, however. Note, for
example, that there is a slight difference between the grammars
in Appendices B and H in that the second alternative
production of the "rule" alternation has a "+" affix on the
"choice" son in Appendix B, while the affix is ".;" in
Appendix H. Since both are represented identically in a
tree, this difference is unimportant. Note that even the
word "choice" need not have been duplicated, so long as the
new name referenced a rule that had the same tag name and
child pattern.

Translation is also possible between higher-level
languages, although such translation depends on the nature
of the two languages themselves. It should be possible, for
example, to translate a subset language program into a
superset language program. Some translation within a
"family" of languages should also be possible. Appendix I
lists modifications to the Minigol grammar to make it
compatible with the Pascal subset grammar of Appendix J,
which may be perceived as a grammar for Pascal functions and
procedures in that it allows declarations of variables
before a master "begin-end" block. Using these two grammars,
the Pascal program segment in Figure 5.2 may be translated
into the Minigol program segment in Figure 2.2 simply
by storing the parsed form under Pascal and reconstructing
the tree under Minigol. The nature of this translation
bears closer examination.


var
   i: integer;
   j: integer;
begin
   i := 0;
   while i < 10 do
   begin
      j := i * i;
      i := i+1
   end
end;


Figure 5.2 Pascal Version of Figure 2.2


The Pascal grammar in Appendix J has similar synthesis
parts to those of the Minigol grammar in Appendix I with one
notable exception: the "block2" rule. In the Minigol
grammar, the synthesis part of this rule creates a "block2"
node with paths to two nonterminal sons, the "decl" sequence
and the "stateblock" node. In the Pascal grammar, however,
this synthesis part creates paths to a nonterminal
"stateblock" node and a terminal named "nil." This is because
Pascal blocks do not include a "declarations" part while
Minigol blocks do. (For example, Minigol would allow declaration
of additional variables above the "j:= i * i" statement.)
The interesting relationship between these grammars
is that any program written in one may be displayed (or
translated) using the other grammar. In the curious case of
translating a Minigol program (with declarations in its
blocks) to Pascal, the declarations simply disappear. This
occurs because the "readprogram" routine in the SDE recursively
constructs the tree from the stored form based only
on the expected number of children for each tag according to
the synthesis part that generated the tag. Since the child
patterns are the same in both grammars, "readprogram" reads
the correct number of children and places them correctly in
the tree when it encounters them. In this particular case,
however, instead of reading a terminal named "nil," the
routine encounters a sequence of declarations, and since
declarations have their own rules in the grammar, the
routine continues to process the file. In other words, the
declarations are still present in the Pascal-constructed
tree. The reason they are never seen by the user is that
unparsing is dictated by the analysis, not synthesis, parts
of the rules. Thus, since the analysis part of the "block2"
rule in Pascal does not mention a "decl" sequence, it is not
listed in the nonterminal dictionary for that rule, and the
unparser bypasses this child branch as if it were a terminal
of the translation. The implications of this invisibility
are that the Pascal grammar should be treated as a subset of
the Minigol grammar, even though the two are not usually
considered this way, and that programs should never be
translated into Pascal from Minigol. Such a translation
would risk causing other tools working directly on the tree
to encounter inappropriate nodes; even tools acting on the
text form would encounter difficulty, since variables
declared in the (Minigol) blocks would have never been
declared in the (Pascal) text form of the program.
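
The behavior just described can be reproduced in miniature. The
Python sketch below is a simplified model, not the SDE's
"readprogram" or unparser: the flat stored form, the child counts
in ARITY, and the per-grammar lists of displayed children are all
assumptions chosen to mirror the "block2" case, with a declaration
sequence reduced to a single declaration.

```python
# Reading depends only on how many children each tag has (synthesis side);
# display depends on which children each grammar chooses to show (analysis side).
ARITY = {"block2": 2, "decl": 1, "asn": 1}     # identical in both grammars

def read_program(tokens):
    """Rebuild the tree from a prefix-order stored form."""
    token = next(tokens)
    if token not in ARITY:                     # leaf text
        return token
    return (token, [read_program(tokens) for _ in range(ARITY[token])])

MINIGOL_ANALYSIS = {"block2": [0, 1]}          # shows declarations and statements
PASCAL_ANALYSIS = {"block2": [1]}              # never mentions the first child

def unparse(node, analysis):
    if isinstance(node, str):
        return node
    tag, children = node
    shown = analysis.get(tag, range(len(children)))
    return " ".join(unparse(children[i], analysis) for i in shown)


stored = ["block2", "decl", "k: integer;", "asn", "j := i * i"]
tree = read_program(iter(stored))
as_minigol = unparse(tree, MINIGOL_ANALYSIS)
as_pascal = unparse(tree, PASCAL_ANALYSIS)
```

Under the Pascal display rules the declaration is absent from the
text, yet it remains in the tree, which is the hazard noted above
for any tool that works on the tree directly.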

Still  another  type  of  translation  offered  by  syntax- 
directed  editing  is  translation  between  representations  of 
the  same  language.  [Ref.  23],  for  example,  describes  four 
"alternative  syntactic  forms"  for  a  particular  object- 
oriented language. Because all four share an identical
program  tree  format,  translation  from  one  form  to  another  is 
a  simple  matter  of  reading  the  stored  form  using  the  grammar 
of  the  desired  representation.  Note  that  translation 
between  forms  of  the  same  semantic  language  does  not  involve 
the  problems  mentioned  above.  However,  since  one  of  the 
four  forms  is  two-dimensional,  it  may  require  special 
formatting  commands  to  display  a  text  program  under  this 
grammar. 

None of the above capabilities of syntax-directed editors
is available in standard text editors. Text editors
cannot translate, check syntax, or do anything to insure
the correctness of a program. What is interesting, however,
is that a syntax-directed editor may even offer text-editing
capabilities not found in text editors. For example, the
structure imposed on text by a syntax-directed editor should
enable the user to view any text top-down, stopping at a
particular depth, so that, for example, only section titles
(but not section contents) are displayed. In fact, this
type of structure serves to provide an automatic table of
contents for the document. Another application of
user-selected viewing is in the use of comments in a program. It
should  be  possible  to  structure  comments  into  levels  of 
complexity  so  that  a  reader  unfamiliar  with  the  source  code 
may  view  only  the  high-level  comments  to  get  a  broad  view, 
while  readers  thoroughly  familiar  with  the  project  can  view 
a  more  specific,  detailed  set  of  comments.  A  variation  of 
user-selected  viewing  is  creator-designated  security.  There 
may  be  certain  items  (text,  code,  etc.)  that  are  not  meant 
to  be  viewed  by  certain  classes  of  user.  These  items  need 
not  be  removed  from  the  product,  but  need  only  be  "filtered" 
from  view  by  not  displaying  them. 
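
Depth-limited viewing of this kind is simple to express over a
tree. The following Python sketch is illustrative only; the
(title, children) node layout, the function name outline, and the
sample titles are hypothetical and are not features of the present
SDE.

```python
# Sketch: display a document tree top-down, stopping at a chosen depth.
# Each node is assumed to be a (title, list of child nodes) pair.

def outline(node, max_depth, depth=0):
    title, children = node
    lines = ["  " * depth + title]
    if depth < max_depth:              # deeper levels are filtered from view
        for child in children:
            lines.extend(outline(child, max_depth, depth + 1))
    return lines


document = ("Thesis", [
    ("I. Introduction", [("A. Programming Environments", [])]),
    ("V. Conclusions", []),
])
contents = outline(document, 1)        # chapter titles only
```

The same filter, keyed on a level or a user class stored in each
node instead of on depth, would give the comment levels and the
creator-designated security described above.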

Syntax-directed editing is appropriate for creating and
manipulating anything representable hierarchically. One
obvious candidate for such editing is a hierarchical database.
It is relatively simple to prescribe a "grammar" for
a hierarchical database, which may be pictured as a collection
of trees whose roots may have any number of dependents,
each of which may have any number of dependents, and so on
[Ref. 24: p. 67]. Appendix G, for example, lists an SDE
input grammar that produces a tree of information about an
organization's training course history given the following
hierarchical structure:

each course has a number, title, description, list of
prerequisites, and list of current offerings;

each prerequisite includes a course number and title;

each offering includes a date offered, location, and
format, as well as lists of instructors and students;

each teacher has a number and name;

each student has a number, name and grade [Ref. 24: p. 280].


Course: M23  Title: DYNAMICS  Descr: ....
Prereqs:
Course #: M19  Title: CALCULUS
Course #: M16  Title: TRIGONOMETRY
Offerings:
Date: 750106  Location: OSLO  Format: F2
Date: 741104  Location: LUBLIN  Format: F3
Date: 730813  Location: MADRID  Format: F3
Teachers:
Emp #: 421633  Name: SHARP, R
Students:
Emp #: 761620  Name: TALLIS, T  Grade: B
Emp #: 183009  Name: GIBBONS, O  Grade: A
Emp #: 102141  Name: BYRD, W  Grade: B


Figure 5.3 SDE Representation of Hierarchical Database


A sample display of a database created using the "grammar"
at Appendix G is shown in Figure 5.3. Figure 5.4 contains a
representation of the database as pictured logically in
[Ref. 24: p. 281]. Not only does one note that both figures
represent the same information, but one can also imagine the
similarity in structure between Figure 5.4 and the tree
constructed by the SDE to represent the database. This
discussion does not intend to suggest that a syntax-directed
editor is capable of performing database operations. It
does show, however, that such editors can represent
hierarchically structured data. It is further suggested that such
editors might serve as suitable editing devices to create
and modify databases representable in a particular
hierarchical structure.
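
As a rough illustration of how such a record maps onto a tree, the
Python sketch below holds part of the course record of Figure 5.3
as nested data and renders it in a similar layout. The dictionary
keys and the function name render are assumptions of this sketch;
the SDE itself stores the record as a program tree built under the
Appendix G grammar.

```python
# Sketch: one course record as nested data, rendered like Figure 5.3.
course = {
    "number": "M23",
    "title": "DYNAMICS",
    "prereqs": [("M19", "CALCULUS"), ("M16", "TRIGONOMETRY")],
    "offerings": [
        {"date": "750106", "location": "OSLO", "format": "F2"},
    ],
}

def render(course):
    lines = [f"Course: {course['number']} Title: {course['title']}", "Prereqs:"]
    for number, title in course["prereqs"]:
        lines.append(f"  Course #: {number} Title: {title}")
    lines.append("Offerings:")
    for offering in course["offerings"]:
        lines.append(
            f"  Date: {offering['date']} Location: {offering['location']}"
            f" Format: {offering['format']}"
        )
    return lines


display = render(course)
```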

D.  CONCLUSIONS  AND  SUGGESTIONS  FOR  FURTHER  RESEARCH

The  overall  objective  of  this  paper  has  been  achieved  in 
the  SDE:  a  working  table-driven  syntax-directed  editor  has 
been implemented, and many lessons regarding syntax-directed
editing  have  been  learned  through  this  implementation.  It 
has  been  shown  that  the  SDE  can  support  virtually  any 
context-free  grammar.  As  a  program,  it  has  been  implemented 
to  be  as  portable  and  flexible  as  possible  through  the  use 
of  variable  command  keys,  the  "TERM"  file,  and  avoidance  of 
system-specific  calls.  In  its  1600  lines  of  Pascal  code,  it 
performs  the  functions  of  both  parser  and  prettyprinter  and 
creates,  stores,  and  retrieves  files.  These  accomplishments 
having  been  noted,  the  SDE  must  now  be  appraised  on  its 
effectiveness  as  a  useful  product. 

The  SDE  is  too  slow  for  actual  program  development.  The 
compromise  of  the  command  line  and  the  need  to  refresh  the 
entire  screen,  rather  than  only  the  edited  portion,  cause  an 
unacceptable delay between user input and program response.
Simple navigation about the tree is clumsy and unintuitive.
In  short,  the  user  interface  needs  significant  improvement. 

The  SDE  also  lacks  features  essential  to  a  viable 
editor. Lack of any search mechanism is a severe limitation.
The absence of a comment input mechanism has also
been  cited. 

As for display, the current SDE still has problems
forcing line breaks on long display lines. When editing
grammars, for example, it occasionally splits terminal
strings after the opening quote, with the result that
prettyprinted forms are perceived to have long strings
extending over two lines. Another problem is that the
heuristic movement of the focus above the current node, done
to increase efficiency and reduce the need to change depth
and focus manually, causes "tunnel vision" when editing the
last items in a long series. For example, attempting to
edit the last terminals and nonterminals in the rules in
Appendix G causes only the concerned rule (in fact, sometimes
only a portion of the rule) to be displayed.

As  a  prototype,  the  SDE  has  done  its  job  in  identifying 
these  shortcomings  and  providing  lessons  from  them  as  well 
as  from  its  successes.  Further,  the  shortcomings  cited 
above  provide  ample  subject  material  for  future  research. 
It is felt that the most significant potential for improvement
to the SDE lies in its interactivity with the user.
Direct  editing  on  the  CP  rather  than  through  a  command  line, 
accompanied  by  immediate  update  of  text  and  menu  alike, 
would  help  make  the  SDE  a  viable,  useful  editor. 

Aside  from  continuing  the  development  of  the  program 
itself,  the  potential  of  the  SDE  as  it  already  exists  has 
yet  to  be  explored.  It  has  been  suggested  that  the  SDE  has 
certain  translation  capabilities  as  well  as  the  ability  to 
create  an  executable  program  tree.  Much  of  this  ability 
stems from its acceptance of grammar-specific synthesis
parts which allow both permutation of the order of nonterminal
children and the inclusion of terminal children. The
effectiveness  of  this  approach  regarding  translation  and 
direct  execution  of  the  program  tree  should  be  researched  to 
assess  its  potential,  and  further  guidelines  for  grammar 
design  should  be  developed. 

This  paper  began  with  a  discussion  of  modern  programming 
environments  and  how  syntax-directed  editors  fit  into  this 
scheme.  As  a  final  research  suggestion,  it  is  recommended 
that the SDE be placed into such an environment, accompanied
by a set of viable grammars and interpretive tools that
could act directly on the trees produced by the SDE. In
this  way  the  utility  of  the  SDE  as  part  of  an  overall 
programming  environment  may  be  assessed. 


APPENDIX  A 

NOTEWORTHY  SDE  DATA  TYPES


const alfalen = 10;
type  alfarec = record
         wrd: alfa;          {Berkeley's packed array
                              [1..10] of char}
         len: 0..alfalen;
      end;

{Pointer and Record types used in the linked list of
 Grammar Rules: }

      grammar = ^grec;
      alist = ^arec;
      taggedalist = ^taggedarec;
      tntlist = ^tntrec;
      altlist = ^altrec;
      defnptr = ^defnrec;

      affixrec = record
         a1, a2, a3: char;
      end;

      grec = record
         name: alfarec;
         next: grammar;
         case isalternation: boolean of
            true: (alternatives: altlist);
            false: (defn: defnptr);
      end;

      altrec = record
         choice: char;
         choicedefn: defnptr;
         choicedisplay: alfa;
         next: altlist;
      end;

      defnrec = record
         anal: tntlist;
         syn: taggedalist;
         ntdict: alist;
      end;

      tntrec = record
         isnt: boolean;
         info: alfarec;
         affix: affixrec;
         next: tntlist;
      end;

      taggedarec = record
         tag: alfarec;
         avlist: alist;
      end;

      arec = record
         attr: alfarec;
         val: alfarec;
         valisnt: boolean;
         affix: affixrec;
         next: alist;
         prev: alist;
      end;

{Types used in the Grammar's Sets: }

      setptr = ^setrec;
      setrec = record
         name: alfarec;
         members: set of char;
         next: setptr;
      end;

{Pointer and Record types used in the Program Tree: }

      nodeptr = ^prognode;
      childptr = ^childnode;

      prognode = record
         name: alfarec;
         syntax: defnptr;
         isnt: boolean;
         parent: nodeptr;
         childlist: childptr;
      end;

      childnode = record
         path: alfarec;
         child: nodeptr;
         next: childptr;
      end;

{Record type that maintains the Current Position
 in the tree: }

      currloc = record
         cn: nodeptr;
         cp: childptr;
      end;

{Types used in the linked list of Legal Menu Choices: }

      choiceptr = ^choicerec;
      choicerec = record
         choice: char;
         descr: alfa;
         next: choiceptr;
      end;


APPENDIX H

ALTERNATE GRAMMAR FOR GENERATING GRAMMARS


The following grammar represents Argot, a grammar-describing
format. It is suitable for input to the SDE and
subsequent use as a grammar-generating grammar by users who
prefer its format over the syntax of the grammar in Appendix B.


gram  mar .( (rule)  +  "STosets:"  (set)*  ) 
(gram  r.  (rule)  ♦  s.  (set)  *  ) 

rule .  A 


le .  A  ( 

"r"^**"  (name)  5  »:  »  (anal)S  »%  =>  »  (syn)?  ) 

(req  n.  (name)S  a.  (anal)  6  s.  (syn)?  )  "regular” 
"a”  ("%5?"  (name)  &  {"  (choice).;  "}  "  ) 

(alt  n.  (name)  &  c.  (choice)  )  "altrnation") 

ime.  (  (char)  +  ) 

(name  c.  (char)  ♦  ) 


choice.  ("%\"  (char)  S  "  ("  (display)  S  ")  :  ”  (anal)  S  "$\\=>  » 
(syn)  ?"!!!") 

(choice  c-  (char)  6  a.  (anal)  6  s.  (syn)  ?  d.  (display)  &  ) 
anal.  ( (tnt^+  ). 


(anal  list,  (tnt)  +  ) 


tnt . A  ( 

Mt»(n*»iMi  (name)&  ”  ) 

(term  n.  (name)  Z  )  "terminal” 

"n"  ("<"  (name)  &  (affix)  &  "  ”  ) 

(nterm  n.  (name)S  a.  (affix)&  )  "nonterm") 


"  (child)*  ) 

node,  (node)  &  c.  (child)  *  ) 


syn.  (  (node)  Z 
(synpart 

node .  ( (name)  Z  ) 

0 

child .(  (path)  &  "  =  "  (cnode) &  ) 

(child  p.(path)&  cn.  (cnode)  S  ) 

path  .^{  (name)  5  ) 
cnode.  (  (tnt)  Z  ) 


af f ix  .  A  ( 

n  £  it  j  tt  tt 
»♦»  (* 

i*i 

It*  ft  Mf  *11 
(*1 

ft  ?  11  Jtf-p  f| 


Mno 

ii  4.  n 

n*  11 


affix"


APPENDIX G

SAMPLE GRAMMAR FOR A DATABASE APPLICATION


db. ( (course)+ )
    (db c.(course)+ )

course. ("%Course#: " (crsnum)& " Title: " (title)&
         " Descr: " (letter)* "%\Prereqs: " "\" (prereq)*
         "!%" "Offerings:" "\" (offerings)* "!!%" )
        (course 1.(crsnum)& 2.(title)& 5.(letter)* 3.(prereq)*
         4.(offerings)* )

prereq. ("%" "Course#: " (crsnum)& " Title: " (title)& )
        (prereq 1.(crsnum)& 2.(title)& )

crsnum. ( (letter)& (digit)& (digit)'& )
        (crsnum 1.(letter)& 2.(digit)& 3.(digit)'& )

title. ( (letter)+ )
       (title 1.(letter)+ )

offerings. ("%Date: " (digit)*" ..".."Location:


teacher. ("%" (prsndat)& )
         (teacher 1.(prsndat)&


prsndat. ("Emp#: " (digit)* " Name: " (letter)* )
         (prsndat 1.(digit)* 2.(letter)* )

student. ("%" (prsndat)& " Grade: " (letter)& )
         (student 1.(prsndat)& 2.(letter)& )


letter.ABCDEFGHIJKLMNOPQRSTUVWXYZ .,..
digit.1234567890..


APPENDIX F

DESCRIPTION OF THE "TERM" FILE


The "TERM" file required by the SDE during initialization
is a text file consisting of 5 lines of integers. The
first four lines in the file contain information about a
particular terminal's commands to activate and inactivate
inverse video, clear the screen and move the cursor to the
beginning of the SDE menu display on the screen. The fifth
line contains dimensional information about the display
area.

The information in the first four lines follows the same
pattern. The first integer on each line represents the
number of integers that follow on that line. (This tells
the SDE how many integers are to be read on the line. Since
the SDE reads the information as integers, it cannot detect
ends of lines without this knowledge.) The second integer
tells the SDE how many spaces on the screen will be physically
occupied by the rest of the sequence on that line.
The remaining integers represent a sequence of characters
that will cause the terminal to perform as desired.
Conversion of the integers to characters is done using the
Pascal "chr" function; when using the ASCII character set,
therefore, the number "65" represents the character "A",
while the number "17" represents the character "control-Q".

The purpose of the second integer on each of the first
four lines is to inform the unparser how many columns to
count when invoking the sequence. Usually, activating
inverse video causes no spaces to be used on the screen for
those terminals that support it. However, for terminals
that do not support inverse video, a printable sequence of
characters must be used instead. These characters will take
up space on the display, and the SDE must account for these
positions to determine where to force carriage returns on
long lines.

Note that the SDE uses these integer strings as "black
boxes" to achieve desired effects. This not only achieves
terminal independence, but it also allows the user some
freedom in customizing his display. For example, Figure 4.1
showed one means of displaying the CP on terminals without
inverse video. The user could easily have used "<<<" and
">>>" instead of the markers shown there by modifying the
first and second lines of the "TERM" file. Note, however,
that it is a good idea to include at least a single space
(ASCII number 32) in the "inverse off" sequence, especially
on those terminals that do support inverse video. This is
because the CP may reference a "nil" node, which has no
display of its own. Activating and inactivating inverse
video in succession (without printing a space in between
these actions) will have no visible effect at all on most
terminals, and the CP will seem to disappear.

The fourth line of integers moves the cursor to the
beginning of the menu display on the screen. On a 24-line
screen, this should be the eighteenth line, first column.
This position affords a sufficient amount of space for the
menu while leaving most of the screen for the program
display. However, in grammars with very long alternation
lists, or on a screen where only three commands will fit on
a single line, the menu must begin higher on the screen to
prevent scrolling. Customization of the menu for the
terminal and particular grammar in use is a refining
process.

The final line of the "TERM" file provides information
on the dimensions of the screen. Like the other lines, the
first integer tells the SDE how many integers follow;
however, the line contains no integer describing how many
spaces are used by the rest of the line. Rather, the second
integer represents the number of lines on the display
(usually 24 or 16); the third integer contains the number of
columns (usually 80 or 64); and the fourth integer states
the line on which the menu commences. Note that this last
integer is required because this information is not always
obvious from the sequence on the fourth line: each
terminal has its own algorithm for moving the cursor.

The following figure represents the "TERM" file for the
VT100 terminal. Brackets have been added to include comments,
which would not appear at all in the actual "TERM"
file.


Figure F.1 'Term' File for VT100 Terminal
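
The line layout described in this appendix can be sketched in a
few lines of Python. This is an illustration only and not the
SDE's Pascal initialization code; the function names are
hypothetical, and the sample values (the ANSI reverse-video
sequence ESC [ 7 m and a 24 by 80 screen with the menu on line 18)
are assumed for the example rather than taken from Figure F.1.

```python
# Sketch of decoding "TERM" file lines as described in the text.

def parse_control_line(line):
    """Lines 1-4: count, screen width used, then character codes."""
    numbers = [int(token) for token in line.split()]
    count, rest = numbers[0], numbers[1:]
    if len(rest) != count:
        raise ValueError("first integer must give the number of integers that follow")
    width, codes = rest[0], rest[1:]
    return width, "".join(chr(code) for code in codes)   # Pascal chr equivalent

def parse_dimension_line(line):
    """Line 5: count, display lines, display columns, first menu line."""
    numbers = [int(token) for token in line.split()]
    count, rest = numbers[0], numbers[1:]
    if count != 3 or len(rest) != 3:
        raise ValueError("dimension line must hold exactly three values")
    rows, columns, menu_line = rest
    return rows, columns, menu_line


inverse_on = parse_control_line("5 0 27 91 55 109")      # occupies no columns
screen = parse_dimension_line("3 24 80 18")
```

Treating the decoded sequence as an opaque string matches the
"black box" use described above: the caller only needs the string
to send and the number of columns it occupies.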
"trm  *  fctr"


"/"  ( 
"  ("  ( 
"#"  ( 
"v"  ( 


mull  1.  (term) &  2.  (factor)S 
term)S  ”  /  "  ^factor) S  ) 


) 


div  1  1 .  (term)  i 
("  (exp)  S  ")'» 
expf  e.  (exp)  5 
number) &  ) 

)  "number" 
var) 6  1 
)  "variable") 


1 


(factor ) 5  )  "trm  / 
"  (exp)  " 


term. A 


"♦"(term)  5  »  ♦  »  (factor)  &  ) 

mul2  1. (term) S  2. (factor)&  )  "trm 
"/"((term)  6  "  /  »  (factor) &  ) 

div2  1.  (term)  &  2.  (factor)&  )  "trm 
..("(.)("  (exp)  &  ")  "  ) 

(expZ  e.  (exp)  S  )  "  (exp)  " 

»#"  ((number)  5  ) 

)  "number" 

"v"  (  var)  &  l 

()  "variable") 

factor. A ( 

"("rr  (exp)  6  ")"1 
f  e.  (exp)  5  )  " 

"#"(  i  number)  &  ) 

)  "number" 

»v"  (  var)  6  ) 

()  "variable") 

var.  ^(id)&  ) 

id. ( (?har) ♦  ) 

(id  n. (char)  +  ) 


(exp)  " 


number.  ( 

(num  va 


it)+  ) 

.  (digit)  +  ) 


char . abcdefghi jklmnopgrstuvwxyz_. . 
digit. 0123456789.. 


fctr" 


*  fctr" 
/  fctr" 
block .  ("SNbegin"  (decl)*  (statement).;  "Send!"  )
(block  head,  (decl)*  body .  (statement) .  ;  ) 


decl 


.  ("S\"  (typeis  (id)  6  »!"  ) 
(decl  n.(ia)&  t.  (type)S  ) 


type. A  ( 

"n"  ("integer  "  ) 

(int  )  "integer" 

"r"  ('’real  "  ) 

(real  )  "real") 

statement. A ( 

"a"  (  (assign)  &  ) 

)  "assignment" 

"v"  (  w hileloop)  &  ) 

Q  "while  loop" 

"b"(  block)  &  ) 

1  "block’* 

"i"  (  ifstat)  &  ) 

()  "if  statmnt") 

assign.  ("S\"  (var)  &  ":=  "  (exp)  S  »!  «  ) 

(asn  d.  (var)  &  s.  (exp)  &  ) 

ifstat.  ("S\if  "  (relation)  6  "  then"  (statement)  6 
(elsepart)?  "!"  ) 

(if  cona.  (relation)  &  conseg .  (statement)  S 
alt.  (elsepart)?  ) 

elsepart. ("Seise"  (statement)  &  ) 

(else  s.  (statement)  S  ) 

whileloop.  ("SNwhile  "  (relation)S  "  do"  (statement) 6  "!"  ) 
(while  cond.  (relation)  6  body,  (statement)  &  ) 

relation.  (  (exp)  S  (relop)  &  (exp)  l&  ) 


(rein  op.  (re lop)  5  i.(exp) 


(exp)  '&  ) 


relop. A  ( 

if  —  n  mi  =  »«  ) 

(eg  )  ’»=" 

"n"  (•’  <>  »  ) 

(ne  }  "not  =» 

"<»  ("  <  ’•  ) 

(It  )  "<" 

»>"  ('’>’') 

(gt  )  ">" 

"1  "  ("  <«  "  ) 

(le  )  "<=" 

Itgll  Ml  >=*  fl  \ 

(ge  )  ">  =" ) 

+  "  ( (exp) &  "  ♦  "  (term)  5 

ada  1.(exp)5  2.  (term)  &  ) 
»-"  (  exp)  5  "  -  ’•  (term)  5 

(sub  1.{exp)S  2.  (term,  &  ) 
"♦"((term)  5  "  *  "  (factor)  &  ) 


)  "exp  ♦  trm" 
)  "exp  -  trm" 


APPENDIX D

INPUT  GRAMMAR  FOR  EXAMPLE  IN  CHAPTER  3 


sample. ((A)& )
    (sample a.(A)& )

A.A(
  "1" ((T)& )
      (a1 t.(T)& ) "A --> T"
  "2" ((A)& " + " (T)& )
      (a2 opnd1.(A)& opnd2.(T)& oprtr."add" ) "A--> A + T")

T.A(
  "1" ((F)& )
      (t1 f.(F)& ) "T --> F"
  "2" ((T)& " * " (F)& )
      (t2 opnd1.(T)& opnd2.(F)& oprtr."mpy" ) "T--> T * F")

F.A(
  "1" ((char)& )
      (f1 c.(char)& ) "F --> char"
  "2" ("(" (A)& ")" )
      (f2 a.(A)& ) "F --> (A)")

char.qrs..


c)  every  nonterminal  name  in  analysis  and  synthesis 
parts  must  either  be  a  rule  name  or  a  set  name. 

3)  Every  non-identity  regular  rule  must  have  at  least  one 
nonterminal  in  its  analysis  and  synthesis  parts. 

4)  The  first  rule  in  the  grammar  must  be  a  non-identity 
regular  rule. 

5)  Individual  choices  of  an  alternation  may: 

a)  be  similar  to  non-identity  regular  rules  in  that 
their  synthesis  parts  may  include  a  tag  and  at  least  one 
nonterminal  son; 

b)  have  synthesis  parts  that  have  only  a  tag  (and  no 
children) ; 

c)  have  synthesis  parts  with  tags  and  terminal  children 
only ; 

d)  be  identities  that  reference  rules  as  described  in  a, 
b,  or  c  above.  (Note  that  the  end  rules  would  meet  the 
"regular  rule"  syntax  —  but  those  in  b  and  c  above  must 
not  be  referenced  as  regular  rules  elsewhere  in  the 
grammar,  since  they  violate  "regular  rule"  semantics.) 
Note  that  identities  may  also  reference  other  identities 
to form a "chain" leading to a rule as described in this
section. 

6)  Choices  of  alternations,  therefore,  must  NOT  be  sets, 
other  alternations,  or  identities  leading  to  sets  or  alter¬ 
nations. 
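
Restriction 6, together with the identity chains of item 5d, can be checked mechanically. The following is a minimal Python sketch, not the SDE's own code. The dictionaries `kinds` and `identities` and both function names are hypothetical: `kinds` maps each rule or set name to "regular", "alternation", or "set", and `identities` maps the name of each identity rule to the name it references.

```python
def final_target(name, kinds, identities):
    """Follow a chain of identities until a non-identity name is reached."""
    seen = set()
    while name in identities:
        if name in seen:
            raise ValueError("identity chain loops back to " + repr(name))
        seen.add(name)
        name = identities[name]
    if name not in kinds:
        # Every nonterminal name must be a rule name or a set name.
        raise KeyError(repr(name) + " is neither a rule name nor a set name")
    return name


def illegal_choices(choice_targets, kinds, identities):
    """Return the identity choices that lead to a set or an alternation."""
    return [
        target
        for target in choice_targets
        if kinds[final_target(target, kinds, identities)] != "regular"
    ]


# Hypothetical excerpt of a Minigol-like grammar.
kinds = {
    "assign": "regular",
    "var": "regular",
    "id": "regular",
    "exp": "alternation",
    "char": "set",
}
identities = {"var": "id"}  # var is an identity that references id
```

With these tables, a choice that references `var` is accepted because the chain ends at the regular rule `id`, while a choice that references `exp` or `char` is reported.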


APPENDIX C

SEMANTIC  RESTRICTIONS  OF  GRAMMAR  DESIGN 

The  following  rules  apply  to  grammars  to  be  used  as 
input  to  the  SDE.  The  syntax  of  such  grammars  may  be  found 
in  Appendix  B. 

1)  For  a  given  regular  rule  or  alternation  choice: 

a) every nonterminal in the analysis part must also
appear  as  a  child  in  the  synthesis  part,  and  vice  versa. 
The  nonterminals  must  match  identically,  including  their 
affixes.  (The  only  exception  to  this  rule  is  the  "iden- 
tity,"  which  consists  of  exactly  one  nonterminal  in  the 
analysis  part,  and  no  synthesis  part  at  all.)  The  order 
of  the  nonterminals  in  the  analysis  part  is  independent 
of  their  order  in  the  synthesis  part; 

b)  the  synthesis  part  may  have  additional  terminal  chil¬ 
dren,  and  the  analysis  part  may  have  additional  terminal 
strings  —  these  are  not  related  to  each  other  at  all; 

c) the path name of each child for a given synthesis
part must be unique;

d) each nonterminal in the analysis part (and synthesis
part) must be unique. If more than one appears, they
are to be distinguished by their affixes: the first
affix of all but one (or all of them, if preferred) will
be a unique symbol selected from the set of digits and
the prime ("'").

2)  Across  the  entire  grammar: 

a) every rule and set name must be unique;

b) the tag name of every synthesis part must be unique;
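
The per-rule restrictions 1a, 1c, and 1d and the grammar-wide uniqueness restrictions 2a and 2b lend themselves to a simple check. The sketch below is illustrative only and assumes a hypothetical representation: the analysis part is a list of nonterminal strings with their affixes, and the synthesis part is a list of (path, child) pairs, or `None` for an identity. The function names are not taken from the SDE.

```python
from collections import Counter


def check_rule(analysis_nts, synthesis_children):
    """Return the labels of the item-1 restrictions that a rule violates.

    analysis_nts: nonterminals of the analysis part, affix included,
        e.g. ["(exp)&", "(relop)&", "(exp)'&"].
    synthesis_children: list of (path, child) pairs, or None when the
        synthesis part is empty. Nonterminal children start with "(".
    """
    if synthesis_children is None:
        # Identity: exactly one nonterminal and no synthesis part.
        return [] if len(analysis_nts) == 1 else ["1a"]
    errors = []
    syn_nts = [child for _, child in synthesis_children if child.startswith("(")]
    # 1a: same nonterminals on both sides, in any order.
    if Counter(analysis_nts) != Counter(syn_nts):
        errors.append("1a")
    # 1c: path names must be unique within one synthesis part.
    paths = [path for path, _ in synthesis_children]
    if len(set(paths)) != len(paths):
        errors.append("1c")
    # 1d: repeated nonterminals must be told apart by their affixes.
    if len(set(analysis_nts)) != len(analysis_nts):
        errors.append("1d")
    return errors


def duplicates(names):
    """Return names that occur more than once (restrictions 2a and 2b)."""
    return sorted(name for name, count in Counter(names).items() if count > 1)
```

For example, a rule whose analysis part holds `(exp)&`, `(relop)&`, and `(exp)'&` passes, because the prime affix distinguishes the second `exp` from the first.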
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"single  nt" 
"Kleene  +" 
"Kleene  *" 


affix .A  ( 
»&"  (»•£"  ) 
i 


affix2.  A  ( 
11511  in  gif 

(6  2 

n  +  11  (if +11 

11*11  (i  W, 

1*2 
11  on  (if->»i 

(?2 

II  m  II  (If  _  II 


"single  nt" 

"Kleene  +" 

"Kleene  *" 

,  "optional" 
(char)  6  ) 


(list2  c.  (char) 6  )  "list" 


display. ((char)+ )

(display  c.  (char)+  ) 

set.  ("%%"  (name)  6  ".  "  (char)  +  " 
(set  n.  (name)  6  c  .  (char)  ♦  ) 


") 


"digit" 


ch ar . abcdef qhijklmno 
1234567890  ,.<>/?: 


digit.  0123456789.. 


the above symbols (excluding the digits and "'"). Also, if
the first character (or second, if the first was a digit or
"'") is a ".", a second (or third) character is added. This
character may be any printable character at all.

A  set  consists  of  a  name  and  a  string  of  characters. 
The  name  is  separated  from  the  string  by  a  period.  Note 
that  this  string  is  not  limited  to  10  characters  in  length; 
in  fact,  it  may  extend  over  several  lines.  The  string  is 
ended  by  a  double  period.  (A  single  period  means  the  symbol 
is  to  be  added  to  the  string.) 
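
As an illustration of the set syntax just described, the following Python sketch splits a set definition into its name and its characters. It is one reading of the prose, not the SDE's implementation: the function name `parse_set` is hypothetical, a period that is not immediately followed by another period is taken as a member of the set, and line breaks inside the string are assumed to be ignored while spaces are kept.

```python
def parse_set(text):
    """Split 'name.characters..' into (name, characters)."""
    name, period, rest = text.partition(".")
    if not period:
        raise ValueError("a set name must be followed by a period")
    members = []
    i = 0
    while i < len(rest):
        # A double period ends the string.
        if rest[i] == "." and rest[i + 1:i + 2] == ".":
            return name.strip(), "".join(members)
        # Assumption: a set may span several lines, so newlines are skipped.
        if rest[i] != "\n":
            members.append(rest[i])
        i += 1
    raise ValueError("set is not terminated by a double period")
```

Under these assumptions, `parse_set("digit.0123456789..")` yields the name `digit` and the ten digit characters, and a lone period in the middle of the string is kept as a member.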

The  following  is  a  grammar  demonstrating  the  above 
syntax. It is also the grammar used by the SDE to create
"programs" in this syntax, and is therefore self-describing
as well as suitable for input to the SDE:


grammar.  ( (rule)  +  (set)*  ) 

(gram  r.(rule)+  s.  (set)  *  ) 

rule. A ( 

"r»  (•»%%"  (name)  &  (anal)S  "%W("  fsyn}?  »)  !  !"  ) 

(reg n.(name)& a.(anal)& s.(syn)? ) "regular"
"a"  C'SS"  (name)  &  ».A{'»  (choice)*  *)"  ) 

(alt n.(name)& c.(choice)+ ) "altrnation")


name.  ( (char) +  ) 

(name  c.  (char)  +  ) 


choice.  ("%\"""  (char)S  """"  (anal)  &  "SV\("  (sy 
(display)  &  »•""!  I!"  ) 

(choice  c.  (char)  6  a.  (anal)  &  s.  (syn)  ?  d.  (di 

anal .  ("  ("  (tnt)  ♦  ")»  ) 

(anal  list,  (tnt)  +  ) 


(syn)?  ")  """ 
(display)  S  ) 


tnt.  A 


t.  A  ( 

ii tn /mi mi  (name)  S  """  "  ) 
(term  n.  (name)  &  )  "t 


(term  n.  (na 
"n"  ("  ("  (name)  & 


I  "terminal" 
(affix)  &  "  ») 


(nterm  n.  (name)C  a.  (affix)S 


1  "n 


onterm") 


syn. ((node)& " " (child)* )
    (synpart node.(node)& c.(child)* )

node. ((name)& )
    ()

child. ((path)& (cnode)& )
    (child p.(path)& cn.(cnode)& )

path. ((name)& )
    ()

cnode. ((tnt)& )
    ()
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Alternation  rules  are  composed  of  a  name  and  one  or  more 
choices.  The  name  is  separated  from  the  rest  of  the  rule  by 
a  period  followed  by  an  "A"  to  distinguish  the  alternation 
from a regular rule. Further, the entire set of choices is
enclosed  in  a  set  of  parentheses.  Each  choice  is  composed 
of  a  single  character  (used  to  select  the  choice) ,  analysis 
and  synthesis  parts,  and  a  display  string  describing  this 
choice  in  the  menu.  As  above,  the  analysis  and  synthesis 
parts  are  each  enclosed  by  their  own  set  of  parentheses. 
The  choice  character  and  display  string  are  enclosed  by 
quotes. 

An  analysis  part  is  a  sequence  of  terminals  and  nonter¬ 
minals.  Terminals  are  strings  delimited  by  quotes.  If  a 
quote  is  to  be  part  of  a  terminal,  it  is  represented  by  two 
consecutive quotes, i.e., "". Nonterminals are names delimited
by parentheses and followed by an affix as described
below.
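
The quote-doubling convention for terminals can be sketched in a few lines of Python. This is an illustrative reader, not code from the SDE, and the function name `read_terminal` is hypothetical. It scans one quote-delimited terminal and returns its value together with the index just past the closing quote.

```python
def read_terminal(text, start=0):
    """Read one quote-delimited terminal that begins at text[start].

    Inside the terminal, two consecutive quotes stand for one literal
    quote character. Returns (value, index_after_closing_quote).
    """
    if start >= len(text) or text[start] != '"':
        raise ValueError("a terminal must begin with a quote")
    chars = []
    i = start + 1
    while i < len(text):
        if text[i] == '"':
            if i + 1 < len(text) and text[i + 1] == '"':
                chars.append('"')  # doubled quote: literal quote
                i += 2
                continue
            return "".join(chars), i + 1  # single quote: end of terminal
        chars.append(text[i])
        i += 1
    raise ValueError("unterminated terminal")
```

A terminal consisting of a single quote character is therefore written as four quotes in a row: one to open, two for the literal quote, and one to close.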

Synthesis  parts  consist  of  a  node  name  followed  by  zero 
or  more  children.  Each  child  is  a  path  name  and  a  child- 
node,  which  are  separated  from  each  other  by  a  period.  A 
childnode  is  either  a  terminal  or  nonterminal  as  described 
above.  Note  that  the  node  name  in  a  synthesis  part  is 
always  followed  by  a  space.  Thus,  even  if  the  node  has  no 
children,  a  space  will  separate  the  name  of  the  node  from 
the  closing  parenthesis  of  the  synthesis  part.  Note  also 
that  synthesis  parts  are  entirely  optional  in  any  rule 
except  the  very  first  in  the  grammar.  However,  the  opening 
and  closing  parentheses  of  the  synthesis  part  must  always  be 
present. 

An  affix  on  a  nonterminal  in  the  analysis  and  synthesis 
parts  of  a  rule  may  consist  of  up  to  three  characters.  The 
first character may be a "&", "+", "*", "?", ".", "'", or
any of the 10 digits. If the character is a digit or "'"
symbol, a second character is included which may be any of
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APPENDIX  B 

DESCRIPTION OF INPUT GRAMMARS TO THE SDE


The  following  paragraphs  describe  the  format  of  a 
grammar  suitable  for  input  to  the  SDE.  At  the  end  of  the 
appendix  is  the  input  grammar  used  to  generate  other  gram¬ 
mars,  which  may  be  considered  "programs"  in  the  "language" 
of  grammars.  Because  the  listing  is  suitable  for  input  to 
the  SDE,  it  both  demonstrates  and  restates  the  prose 
description  below. 

Certain  semantic  details  should  be  noted  at  the  outset. 
All  strings  described  below  must  be  no  more  than  10  charac¬ 
ters  in  length.  This  is  not  checked  by  the  grammar- 
generating  grammar.  A  string  of  more  than  10  characters 
will  cause  the  SDE  to  print  a  warning  message  during 
initialization,  and  the  SDE  will  truncate  the  string  to  10 
characters.  Other  special  considerations  involve  the  use  of 
double  quotes  and  double  periods  in  place  of  single  charac¬ 
ters,  as  described  below.  Further  semantic  requirements  are 
listed  in  Appendix  C.  Note  also  that,  except  where  speci¬ 
fied  otherwise,  items  in  a  grammar  are  separated  by  a  space. 
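
The 10-character limit and the truncate-with-warning behavior described above can be pictured with the following Python sketch. It is only an approximation of what the SDE does during initialization: the name `clip_string` and the wording of the warning are assumptions.

```python
import warnings

MAX_STRING_LENGTH = 10  # limit stated in this appendix


def clip_string(s):
    """Return s unchanged if it fits, else warn and keep the first 10 characters."""
    if len(s) > MAX_STRING_LENGTH:
        warnings.warn(
            "string %r exceeds %d characters and was truncated"
            % (s, MAX_STRING_LENGTH)
        )
        return s[:MAX_STRING_LENGTH]
    return s
```

Because the grammar-generating grammar does not enforce the limit, a check of this kind has to run when the grammar is read.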

An  input  grammar  is  a  sequence  of  one  or  more  rules 
followed  by  a  sequence  of  zero  or  more  sets.  The  sets  are 
separated  from  the  rules  by  a  colon  on  a  separate 
line.  The  first  rule  must  be  a  regular  rule  with  a  non-null 
synthesis  part,  as  described  below. 

A rule is either a "regular" rule or an "alternation."
A regular rule is composed of a name, an analysis part and a
synthesis part. The name is a string (of up to 10 characters)
and is followed by a period (".") to delimit it from
the  rest  of  the  rule.  The  analysis  and  synthesis  parts  are 
each  enclosed  by  a  set  of  parentheses. 
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(O']  )  II  OH 

iiti.  (ilin  (affix 2)  &  ) 

prime  2.(affix2)&  )  "prime" 
(  char)  S  ». ..  "  ) 

(list  c.  (char)  t  )  "list") 


char.ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz

As  stated,  the  above  grammar  is  suitable  for  designing 
grammars  in  a  different  format  from  that  of  Appendix  B.  An 
example  of  the  format  described  above  is  the  following, 
which  is  the  same  grammar  in  the  new  format. 


grammar:  <rule>+  "*?!sets: "  <set>* 

=> gram: r=<rule>+ s=<set>* .

rule: {

r (regular):  "*%"  <name>  " :  "  <anal>  "%  =>  "  <syn>? 

=>  reg:  n=<name>  a=<anal>  s=<syn>?  ; 
a  (altrnation)  :  "55*"  <name>  ("  <choice>;... 

=>  alt:  n=<name>  c=<choice> ; . . .  } 

name: <char>+

=> name: c=<char>+ .


choice:  "5&\"  <char> 
<syn>?  " ! !  1 " 

=>  choice:  c=<char> 


"("  <display >  ")  :  "  <anal>  »%\\=>  » 

a=<anal>  s=<syn>?  d=<display> 


anal:  <tnt>+ 

=>  anal:  list=<tnt>+  . 


tnt :  { 

t (terminal): """" <name> """ "

=> term: n=<name> ;
n (nonterm): "<" <name> ">" <affix>
=> nterm: n=<name> a=<affix> }


syn:  <node>  ":  "  <child>* 

=>  synpart:  node=<node>  c=<child>*  . 


node:  <name> 
=>  . 
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child:  <path>  M  ="  <cnode> 

=>  child:  p=<path>  cn=<cnode> 

path:  <name> 


cnode:  <tnt> 
=>  . 


affix:  { 

& (no  affix) :  "  " 

=>  SI:  : 

+ (+) : 

=>  +  1 :  ; 

*  [*)  :  »*»• 

=>  *1:  ; 

?  (?) :  »?•» 

=>  ?1:  • 

'  (prime)  :*"*  "  <affix2> 

=>  prime:  2=<affix2> 
.(list):  <char> 

=>  list:  c=<char>  } 


it  it 


affix2:  { 

&  (no  affix) : 

=>  &2:  ; 

♦(+)  :  ' "+" 

=>  +2:  ; 

*  /*)  .  ii*n 

=>  *2:  ; 

? (?) :  "?" 

=>  ?2 :  • 

.(list): <char>

=>  list:  c=<char> 


display: <char>+

=>  dsply:  c=<char>+  . 

set:  "V  <name>  "  =  (”  <char>, 
=>  set:  n=<name>  c=<char>,... 


ii  j  it 


sets : 

char  =  f  ABCDEFGHIJKLHNOPQSSTUVUXYZabcdef  ghiiklmnOj,  grstuvvxyz 
1234567890  ,  .<>/?:  y  |  5) # l_- +=  !\S] 
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APPENDIX  I 

MINIGOL GRAMMAR MODIFIED FOR PASCAL COMPATIBILITY

The  following  is  a  modification  of  the  "Minigol"  grammar 
presented  in  Appendix  E.  The  purpose  of  the  modification  is 
to make Minigol compatible with the Pascal grammar in
Appendix  J.  See  Section  5.C  for  the  implications  of  this 
compatibility. 


block. ("%\begin" (decl)? (stateblock)& "%end!" )
    (block head.(decl)? body.(stateblock)& )

stateblock. ((statement).; )
    (stateblock s.(statement).; )

decl. ((decl2)+ )
    (decl d.(decl2)+ )

decl2. ("%\" (type)& (id)& "!" )
    (decl2 n.(id)& t.(type)& )

type.A(
  "n" ("integer " )
      (int ) "integer"
  "r" ("real " )
      (real ) "real")

statement.A(
  "a" ((assign)& )
      () "assignment"
  "w" ((whileloop)& )
      () "while loop"
  "b" ((block2)& )
      () "block"
  "i" ((ifstat)& )
      () "if statmnt")

block2. ("%\begin" (decl)* (stateblock)& "%end!" )
    (block2 head.(decl)* body.(stateblock)& )

assign. ("%\" (var)& ":= " (exp)& "!" )
    (asn d.(var)& s.(exp)& )

ifstat. ("%\if " (relation)& " then" (statement)& (elsepart)? "!" )
    (if cond.(relation)& conseq.(statement)& alt.(elsepart)? )

elsepart. ("%else" (statement)& )
    (else s.(statement)& )

whileloop. ("%\while " (relation)& " do" (statement)& "!" )
    (while cond.(relation)& body.(statement)& )

relation. ((exp)& (relop)& (exp)'& )
    (rel op.(relop)& 1.(exp)& 2.(exp)'& )


relop.A(
  "=" (" = " )
      (eq ) "="
  "n" (" <> " )
      (ne ) "not ="
  "<" (" < " )
      (lt ) "<"
  ">" (" > " )
      (gt ) ">"
  "l" (" <= " )
      (le ) "<="
  "g" (" >= " )
      (ge ) ">=")
exp.A(
  "+" ((exp)& " + " (term)& )
      (add 1.(exp)& 2.(term)& ) "exp + trm"
  "-" ((exp)& " - " (term)& )
      (sub 1.(exp)& 2.(term)& ) "exp - trm"
  "*" ((term)& " * " (factor)& )
      (mul1 1.(term)& 2.(factor)& ) "trm * fctr"
  "/" ((term)& " / " (factor)& )
      (div1 1.(term)& 2.(factor)& ) "trm / fctr"
  "(" ("(" (exp)& ")" )
      (exp1 e.(exp)& ) "(exp)"
  "#" ((number)& )
      () "number"
  "v" ((var)& )
      () "variable")

term.A(
  "*" ((term)& " * " (factor)& )
      (mul2 1.(term)& 2.(factor)& ) "trm * fctr"
  "/" ((term)& " / " (factor)& )
      (div2 1.(term)& 2.(factor)& ) "trm / fctr"
  "(" ("(" (exp)& ")" )
      (exp2 e.(exp)& ) "(exp)"
  "#" ((number)& )
      () "number"
  "v" ((var)& )
      () "variable")

factor.A(
  "(" ("(" (exp)& ")" )
      (exp3 e.(exp)& ) "(exp)"
  "#" ((number)& )
      () "number"
  "v" ((var)& )
      () "variable")

var. ((id)& )
    ()

id. ((char)+ )
    (id n.(char)+ )

number. ((digit)+ )
    (num val.(digit)+ )

char.abcdefghijklmnopqrstuvwxyz_..
digit.0123456789..


APPENDIX  J 

PASCAL  SUBSET  GRAMMAR 


block. ("%\" (decl)? (stateblock)& )
    (block d.(decl)? body.(stateblock)& )

stateblock. ("%\begin" (statement).; "%end!" )
    (stateblock body.(statement).; )

decl. ("%\var\" (decl2)+ "!!" )
    (decl d.(decl2)+ )

decl2. ("%" (id)& ": " (type)& ";" )
    (decl2 n.(id)& t.(type)& )

type.A(
  "n" ("integer" )
      (int ) "integer"
  "r" ("real" )
      (real ) "real")

statement.A(
  "a" ((assign)& )
      () "assignment"
  "w" ((whileloop)& )
      () "while loop"
  "b" ((block2)& )
      () "block"
  "i" ((ifstat)& )
      () "if statmnt")

block2. ((stateblock)& )
    (block2 x."nil" body.(stateblock)& )

assign. ("%\" (var)& ":= " (exp)& "!" )
    (asn d.(var)& s.(exp)& )

ifstat. ("%\if " (relation)& " then" (statement)& (elsepart)? "!" )
    (if cond.(relation)& conseq.(statement)& alt.(elsepart)? )

elsepart. ("%else" (statement)& )
    (else s.(statement)& )

whileloop. ("%\while " (relation)& " do" (statement)& "!" )
    (while cond.(relation)& body.(statement)& )

relation. ((exp)& (relop)& (exp)'& )
    (rel op.(relop)& 1.(exp)& 2.(exp)'& )

relop.A(
  "=" (" = " )
      (eq ) "="
  "n" (" <> " )
      (ne ) "not ="
  "<" (" < " )
      (lt ) "<"
  ">" (" > " )
      (gt ) ">"
  "l" (" <= " )
      (le ) "<="
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  "g" (" >= " )
      (ge ) ">=")

exp.A(
  "+" ((exp)& " + " (term)& )
      (add 1.(exp)& 2.(term)& ) "exp + trm"
  "-" ((exp)& " - " (term)& )
      (sub 1.(exp)& 2.(term)& ) "exp - trm"
  "*" ((term)& " * " (factor)& )
      (mul1 1.(term)& 2.(factor)& ) "trm * fctr"
  "/" ((term)& " / " (factor)& )
      (div1 1.(term)& 2.(factor)& ) "trm / fctr"
  "(" ("(" (exp)& ")" )
      (exp1 e.(exp)& ) "(exp)"
  "#" ((number)& )
      () "number"
  "v" ((var)& )
      () "variable")

term.A(
  "*" ((term)& " * " (factor)& )
      (mul2 1.(term)& 2.(factor)& ) "trm * fctr"
  "/" ((term)& " / " (factor)& )
      (div2 1.(term)& 2.(factor)& ) "trm / fctr"
  "(" ("(" (exp)& ")" )
      (exp2 e.(exp)& ) "(exp)"
  "#" ((number)& )
      () "number"
  "v" ((var)& )
      () "variable")

factor.A(
  "(" ("(" (exp)& ")" )
      (exp3 e.(exp)& ) "(exp)"
  "#" ((number)& )
      () "number"
  "v" ((var)& )
      () "variable")

var. ((id)& )
    ()

id. ((char)+ )
    (id n.(char)+ )

number. ((digit)+ )
    (num val.(digit)+ )

char.abcdefghijklmnopqrstuvwxyz_..
digit.0123456789..
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